Innovations in Analytics

There are many analytic engines and appliances out there, yet many of them share common technology patterns: in-memory databases, transparent sharding and columnar storage all appear in many of these products. What I want to concentrate on in this article are particular products that bring something innovative to the table. Here … Continue reading

Optimal Compressed Data File Strategies for HDInsight and Azure Data Lake

HDInsight (Microsoft’s canned Azure Hadoop offering) and Azure Data Lake are competing Azure offerings, with many similar features and yet significant differences.

One of the significant differences between the two platforms is their ability to process compressed file formats. This article looks at the similarities and differences between the two and attempts to formulate strategies to get the maximum performance from each platform. Continue reading

Comparing HDInsight with Azure Big Data Services

Microsoft offers both the Hadoop ecosystem on Azure (which it collectively calls HDInsight) and a range of Azure Big Data services. This article attempts to compare and contrast these technologies and to suggest reasons why you might choose one over another. Equivalent Technologies Below is a simplified matrix showing the Azure Big Data … Continue reading