Welcome to Burst
Coming Soon
- EQL - the Burst user-facing language
- HYDRA - the Burst code-generated, back-end, distributed scan language
Welcome to the open source release of the Burst behavioral analysis engine! I very much hope you have as much fun exploring the code of this system as we did developing it. We want to learn from your experience and encourage you to give us your feedback freely. As you dive in, I think it is worth setting expectations regarding the state of this first release.
First of all, Burst has been in 24x7 production for many years, handling real-world, high-concurrency, ad hoc interactive client sessions performing deep analysis of large datasets. In short, it has been pushed hard for a long time by diverse, continuous, unforgiving real-world workloads. It imports large datasets on demand, in parallel, directly from hundreds of HBASE region servers hosting petabytes of mobile application event data for many thousands of customers.
That being said, because of our limitations as a small team and the overall sophistication of this approach, our open source release is not a plug-and-play product that you can download, configure, and run in a weekend. We are releasing this to the world now instead of waiting because we believe there is much of interest here for those who are doing this sort of work and supporting this kind of analysis. We have endeavored to document extensively so that even a casual investigation will be of value. But to actually deploy Burst on your compute cluster, loading data from your stores, you will need to invest nontrivial time and effort in authoring a custom schema, implementing custom data encoders and data loaders, and setting up an appropriate node orchestration for the runtime. We hope to publish our custom HBASE data importer in a subsequent release as a generic store example, and that others will contribute their data stores as this effort progresses. Stay tuned.
-erik freed
Yahoo Distinguished Architect