-
Harvest machine data using Hadoop and Hive
A new article has been published on IBM developerWorks, looking at the basics of processing machine data using Hadoop: extracting the core data, storing it, and then determining the baselines and trigger points required to identify worrying trends. From the intro: Machine data can come in many different formats and quantities.…
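To give a flavour of the baseline/trigger idea the article covers, here is a minimal Python sketch (my own illustration, not the article's actual pipeline): it establishes a rolling baseline over a metric stream and flags readings that drift too far from it. The window size, the 3-sigma threshold, and the synthetic data are all assumptions.

```python
# Minimal sketch of baselines and trigger points for machine data:
# keep a rolling window of readings, and flag any reading that falls
# outside SIGMAS standard deviations of the window's mean.
import statistics
from collections import deque

WINDOW = 60   # readings used to form the baseline (assumed)
SIGMAS = 3.0  # how far a reading may drift before it triggers (assumed)

def find_triggers(readings):
    """Yield (index, value) for readings outside the rolling baseline."""
    window = deque(maxlen=WINDOW)
    for i, value in enumerate(readings):
        if len(window) == WINDOW:
            mean = statistics.fmean(window)
            stdev = statistics.stdev(window)
            if stdev and abs(value - mean) > SIGMAS * stdev:
                yield i, value
        window.append(value)

if __name__ == "__main__":
    # Synthetic machine data: a steady signal with one obvious spike.
    data = [100.0 + (i % 5) for i in range(200)]
    data[150] = 500.0
    for idx, val in find_triggers(data):
        print(f"trigger at reading {idx}: {val}")
```

In a real deployment the readings would come out of the data stored in Hadoop (for example, via a Hive query) rather than an in-memory list.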
-
Tungsten Replicator 3.0 is Cloudera Enterprise 5 Certified
One of the key platforms I’ve been testing on for the MySQL to Hadoop replication has been Cloudera, largely driven by customer requirements, but it’s also one of the easiest ways to get started with Hadoop. What pleases me even more is that we can now announce that Tungsten Replicator 3.0 is…
-
Continuent Replication to Hadoop – Now in Stereo!
Hopefully by now you have already seen that we are working on Hadoop replication. I’m happy to say that it is going really well. I’ve managed to push a few terabytes of data, across several different data sets, into Hadoop on Cloudera, Hortonworks, and Amazon’s Elastic MapReduce (EMR). For those who have been following my…
-
Real-Time Data Loading from MySQL to Hadoop using Tungsten Replicator 3.0 Webinar
To follow up and describe some of the methods and techniques behind replicating from MySQL into Hadoop in real time, and how this can be combined with your data workflow, Continuent are running a webinar, with me presenting, that will go over the details and provide a demo of the data replication process. Real-Time Data Loading from…
-
Parallel Extractor for Provisioning
Coming up as a new feature in Tungsten Replicator (and written by our replicator expert Stephane Giron) is the ability to provision a new database by using data from an existing database. This new feature comes in the form of a tool called the Parallel Extractor. The principles are very simple. On the master side: Start…
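As a rough illustration of the principle (my own sketch, not the Parallel Extractor's actual implementation), the Python fragment below splits a source table's primary-key range into chunks and extracts the chunks concurrently. sqlite3 stands in for the real MySQL source so the example is self-contained; the table layout, chunk size, and worker count are assumptions.

```python
# Sketch of parallel extraction for provisioning: divide the key range
# of the source table into chunks and read them with a pool of workers.
import sqlite3
from concurrent.futures import ThreadPoolExecutor

CHUNK = 1000  # rows per extraction chunk (assumed)

def extract_chunk(db_path, lo, hi):
    # Each worker opens its own connection; sqlite3 connections
    # are not safely shared across threads.
    conn = sqlite3.connect(db_path)
    try:
        return conn.execute(
            "SELECT id, payload FROM events WHERE id >= ? AND id < ?",
            (lo, hi),
        ).fetchall()
    finally:
        conn.close()

def provision(db_path, workers=4):
    conn = sqlite3.connect(db_path)
    lo, hi = conn.execute("SELECT MIN(id), MAX(id) FROM events").fetchone()
    conn.close()
    ranges = [(s, min(s + CHUNK, hi + 1)) for s in range(lo, hi + 1, CHUNK)]
    total = 0
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for rows in pool.map(lambda r: extract_chunk(db_path, *r), ranges):
            total += len(rows)  # a real provisioner would apply these rows
    print(f"extracted {total} rows in {len(ranges)} chunks")

if __name__ == "__main__":
    # Build a small stand-in source table, then provision from it.
    conn = sqlite3.connect("source.db")
    conn.execute("CREATE TABLE IF NOT EXISTS events"
                 " (id INTEGER PRIMARY KEY, payload TEXT)")
    conn.executemany("INSERT OR REPLACE INTO events VALUES (?, ?)",
                     [(i, f"row-{i}") for i in range(1, 5001)])
    conn.commit()
    conn.close()
    provision("source.db")
```

The point of chunking by key range is that each worker's read is independent, so provisioning speed scales with the number of workers rather than being bound to a single sequential scan.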
-
Using the Continuent Docs
As you have hopefully noticed, the Continuent documentation is reaching a pretty good critical mass. The content of the documentation is always the most important consideration. A close second is making sure that the information in the documentation can be found, and that while reading you can hover and click to get relevant information so that you…
-
Intelligent Linking and Indexing in DocBook
One of the issues I have with DocBook XML is that the links are a little forced and manual. By that I mean that if I use a command, like trepctl, in a sentence or description, and I want to link trepctl back to the corresponding trepctl page, I have to…
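For a flavour of the kind of automation this is hinting at (my own sketch, not a feature of the DocBook toolchain), the Python fragment below wraps bare <command> elements in a <link> whose linkend is derived from the command name. The "cmdline-tools-<name>" id scheme and the un-namespaced DocBook 4-style input are assumptions.

```python
# Sketch: auto-link bare <command> elements back to their reference
# pages by wrapping each one in a <link linkend="..."> element.
import xml.etree.ElementTree as ET

def autolink_commands(root):
    # ElementTree has no parent pointers, so build a child->parent map.
    parents = {child: parent for parent in root.iter() for child in parent}
    for cmd in list(root.iter("command")):
        parent = parents[cmd]
        if parent.tag == "link":
            continue  # already linked by hand
        link = ET.Element("link", {"linkend": f"cmdline-tools-{cmd.text}"})
        idx = list(parent).index(cmd)
        parent.remove(cmd)
        link.append(cmd)
        parent.insert(idx, link)
        # Move the trailing text so the sentence stays intact.
        link.tail, cmd.tail = cmd.tail, None

doc = ET.fromstring(
    "<para>Use <command>trepctl</command> to check replicator status.</para>"
)
autolink_commands(doc)
print(ET.tostring(doc, encoding="unicode"))
```

Run over a whole book source, a script like this turns every mention of a command into a live cross-reference without touching the prose by hand.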
-
MC at Percona Live San Francisco 2014
Now I’m back in the MySQL fold, I’ve got the opportunity to speak at Percona Live again. I’ve always enjoyed speaking at this conference (back when it was known by another name…), although I need to up my game to match the six talks I did back in 2009. On the Tuesday afternoon, tutorials day,…
-
Customizing Chunking in DocBook
I love DocBook XML. No, really. But one thing I hate is the way you have to set a global chunking level for your HTML and then live with it. For most documentation, you want to be able to choose whether a section within a chapter gets its own conveniently addressable page, and then you want to combine it…
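One possible escape hatch, sketched here on the assumption that the build uses the stock DocBook XSL stylesheets (which honour the dbhtml stop-chunking processing instruction), is to mark sections individually instead of relying on the global level. The "short section" heuristic, file names, and un-namespaced source are my own assumptions.

```python
# Sketch: per-section chunking control by injecting a
# <?dbhtml stop-chunking?> processing instruction into sections
# that are too short to deserve their own page.
import xml.etree.ElementTree as ET

MAX_PARAS = 3  # sections this short stay in the parent chunk (assumed)

def mark_short_sections(root):
    for sect in root.iter("section"):
        if len(sect.findall("para")) <= MAX_PARAS:
            sect.insert(0, ET.ProcessingInstruction("dbhtml", "stop-chunking"))

tree = ET.parse("book.xml")  # assumed DocBook source document
mark_short_sections(tree.getroot())
tree.write("book-chunked.xml")
```

The appeal of driving this from a script is that the chunking decision becomes a property of each section's content rather than a single stylesheet parameter you have to live with everywhere.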
-
MySQL to Hadoop Step-By-Step
We had a great webinar on Thursday about replicating from MySQL to Hadoop (watch the whole thing). One of the questions at the end was ‘is there an easy way to test?’. Sadly, we can’t give out convenient ready-to-run downloads of these things because of licensing and other complexities,…
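While a ready-made download is out, a bare-bones way to get something testable is sketched below: pull a table out of MySQL as tab-separated text and push it into HDFS, so there is something to point a Hive table at. This is my own stand-in for a quick smoke test, not the Tungsten pipeline itself; the table name, local path, and HDFS target are assumptions.

```python
# Sketch: dump a MySQL table as TSV and load it into HDFS as a
# quick, hand-rolled stand-in for trying the MySQL-to-Hadoop flow.
import subprocess

TABLE = "test.sales"           # assumed source table
LOCAL = "/tmp/sales.tsv"       # assumed local staging file
HDFS_DIR = "/user/demo/sales"  # assumed HDFS target directory

with open(LOCAL, "w") as out:
    # mysql -B emits tab-separated rows; -N drops the header line.
    subprocess.run(
        ["mysql", "-B", "-N", "-e", f"SELECT * FROM {TABLE}"],
        stdout=out, check=True,
    )

subprocess.run(["hdfs", "dfs", "-mkdir", "-p", HDFS_DIR], check=True)
subprocess.run(["hdfs", "dfs", "-put", "-f", LOCAL, HDFS_DIR], check=True)
print(f"loaded {TABLE} into {HDFS_DIR}; create an external Hive table over it")
```

It is one-shot rather than real-time, but it is enough to exercise the Hive side of the workflow while you set up the real replication.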