A cluster is a set of virtual machines.
If I had to set this up myself, it would take a lot of time and work. Instead I’ll use the
Windows Azure Portal to create and configure a new Hadoop cluster in just minutes.
Click HDInsight to see the clusters I already have, and their status.
No surprise, I have zero clusters.
Click New > Quick Create.
I name my cluster clusterForTutorial.
The DNS name I assign will determine the address for the cluster in the azurehdinsight.net domain.
I’ve specified clusterForTutorial, so the address of my cluster is http://clusterForTutorial.azurehdinsight.net.
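As a minimal sketch of how the DNS name maps to the cluster address described above, the helper below (the function name cluster_address is hypothetical, not part of any Azure tool) simply places the chosen name in front of the azurehdinsight.net domain:

```python
def cluster_address(dns_name: str) -> str:
    """Build the cluster URL from the DNS name chosen in Quick Create.

    Hypothetical helper for illustration; it only mirrors the naming
    pattern stated in the video.
    """
    return f"http://{dns_name}.azurehdinsight.net"


print(cluster_address("clusterForTutorial"))
```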
Four nodes is more than enough for now. The price varies according to the number of nodes I select,
so I only select the number I need. If my needs change, I can add more or even remove some. I enter my
super-secret password, which must have at least 10 characters, including 1 capital letter, 1 number, and 1 special character.
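The password rules just mentioned can be sketched as a quick local check before typing a password into the portal. This is an illustration only, not the portal's own validation: the function name is hypothetical, and it assumes "special character" means ASCII punctuation, which the video does not specify.

```python
import string


def meets_password_rules(password: str) -> bool:
    """Check the four rules stated in the video.

    Assumption: a "special character" is any ASCII punctuation mark.
    The portal's real rules may be stricter or looser.
    """
    return (
        len(password) >= 10                                   # at least 10 characters
        and any(c.isupper() for c in password)                # at least 1 capital letter
        and any(c.isdigit() for c in password)                # at least 1 number
        and any(c in string.punctuation for c in password)    # at least 1 special character
    )
```

A password that is long enough but has no capital letter, for example, would fail this check.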
As I mentioned earlier, I want my cluster and my storage in the same datacenter. This is where I’ll run my jobs
and store the data I analyze in HDInsight. Keep in mind that once a storage account is chosen,
it cannot be changed. If the HDInsight cluster is removed,
the cluster will no longer be available, but the data will be safe in the Windows Azure storage account.
When I click Create Clusters, Windows Azure begins creating my cluster. It may take a while;
for me it took about 6 minutes. I’ll know it’s done when the status on the HDInsight screen is listed as Running.
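Watching for the Running status amounts to a simple polling loop, sketched below. The video only shows checking the portal by eye; the get_status callable here is a hypothetical stand-in for however the status is read, and the only status value taken from the video is "Running".

```python
import time
from typing import Callable


def wait_until_running(
    get_status: Callable[[], str],
    poll_seconds: float = 30.0,
    timeout_seconds: float = 1800.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll until the cluster reports "Running" or the timeout passes.

    get_status is a hypothetical callable supplied by the caller; this
    sketch does not talk to Windows Azure itself.
    """
    waited = 0.0
    while True:
        if get_status() == "Running":
            return True
        if waited >= timeout_seconds:
            return False
        sleep(poll_seconds)
        waited += poll_seconds
```

The defaults are arbitrary; with creation taking around 6 minutes in the video, a 30-second poll interval would mean roughly a dozen checks.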
Behind the scenes, HDInsight is creating and configuring the VMs that together will make up my HDInsight cluster.
It’s also installing components like Apache Hadoop, Hive, Pig, Sqoop, Oozie, HCatalog, and
the SQL Server JDBC driver.
In the next video in this series, we’ll take a closer look at what was installed.
We’ll also take a tour of the HDInsight Dashboard, review ways to interact with our
cluster, and run our first job.
Please visit windowsazure.com for more information.