PASS Summit 14 – Preconf (in progress)

I'm attending the preconf session Big Data: Deploy, Design, and Manage Like a Pro, where Buck Woody (web), Adam Jorgensen (web | twitter) and John Welch (web | twitter) are doing their magic with Azure, HDInsight, PowerShell and everything in between.


Great questions from the attendees, and even greater answers.

Here are some key points from Buck. I think they've always been relevant, but they matter even more now with Big Data.

  • Always ask the right questions
  • Never select the tech beforehand
  • Always select the TECHNOLOGIES after the questions have been asked and answered
  • Moving 1 TB of data to Azure in one go: DON'T DO THAT
  • Send data in a trickle instead, as incremental data loads
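
To make the trickle idea a bit more concrete, here is a minimal Python sketch of an incremental load. It is not something shown in the session: the record shape, the watermark approach and the names `new_records`, `trickle_load` and `send` are all my own assumptions, and `send` just stands in for whatever actually uploads a batch.

```python
from typing import Callable, Iterable, Iterator, List, Tuple

# Hypothetical record shape: (sequence number, payload).
Record = Tuple[int, str]


def new_records(records: Iterable[Record], watermark: int) -> Iterator[Record]:
    """Yield only the records whose sequence number is above the watermark."""
    for rec in records:
        if rec[0] > watermark:
            yield rec


def trickle_load(
    records: Iterable[Record],
    watermark: int,
    send: Callable[[List[Record]], None],
    batch_size: int = 2,
) -> int:
    """Send unseen records in small batches and return the new watermark.

    `send` is a placeholder for the real upload step (an assumption here).
    """
    batch: List[Record] = []
    for rec in new_records(records, watermark):
        batch.append(rec)
        if len(batch) == batch_size:
            send(batch)
            # Advance the watermark only after the batch was handed off.
            watermark = max(watermark, max(r[0] for r in batch))
            batch = []
    if batch:  # send the final, partially filled batch
        send(batch)
        watermark = max(watermark, max(r[0] for r in batch))
    return watermark


if __name__ == "__main__":
    source = [(1, "a"), (2, "b"), (3, "c"), (4, "d"), (5, "e")]
    uploaded: List[List[Record]] = []
    mark = trickle_load(source, watermark=2, send=uploaded.append)
    print(uploaded, mark)
```

The point of the watermark is that a later run only has to pick up what arrived since the last one, so no single transfer ever gets big.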

PowerShell in a nutshell:

  • Scripting language
  • Based on cmdlets ("command-lets")
    • Named Verb-Noun
    • DIR becomes Get-ChildItem
  • Variables always start with a $
    • $Datasource
  • Everything is an OBJECT

John Welch is starting to talk about how to load data into your Azure storage. For this task we're loading data from and

John has a tool to download the XML feed from Twitter and LinkedIn. The data needs to be preprocessed one record at a time.


  • Text files need to be UTF-8 without a BOM
  • Records are delimited by newlines
  • Formats
    • Several formats can be used
    • Delimited text
    • SequenceFile
    • RCFile / ORC (Optimized Row Columnar) file
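
The text file rules above are easy to get wrong, so here is a small Python sketch of the delimited text case. This is my own illustration, not John's tool: the function names, the tab delimiter and the choice to replace embedded line breaks with spaces are all assumptions.

```python
import codecs
import os
import tempfile
from typing import Iterable


def clean_record(fields: Iterable[str], delimiter: str = "\t") -> str:
    """Join fields into one line; embedded line breaks and delimiters become spaces."""
    cleaned = []
    for field in fields:
        for ch in ("\r\n", "\n", "\r", delimiter):
            field = field.replace(ch, " ")
        cleaned.append(field)
    return delimiter.join(cleaned)


def write_records(path: str, records: Iterable[Iterable[str]], delimiter: str = "\t") -> None:
    """Write one record per line, one record at a time."""
    # encoding="utf-8" does not emit a BOM (unlike "utf-8-sig");
    # newline="\n" stops Windows from turning "\n" into "\r\n".
    with open(path, "w", encoding="utf-8", newline="\n") as out:
        for fields in records:
            out.write(clean_record(fields, delimiter) + "\n")


def has_bom(path: str) -> bool:
    """Return True if the file starts with the UTF-8 byte order mark."""
    with open(path, "rb") as f:
        return f.read(3) == codecs.BOM_UTF8


if __name__ == "__main__":
    target = os.path.join(tempfile.mkdtemp(), "records.tsv")
    write_records(target, [["1", "first line\nsecond line"], ["2", "plain text"]])
    print(has_bom(target))
```

Checking with `has_bom` before uploading is cheap, and it is the kind of thing that is much harder to spot once the file is already sitting in storage.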


More to come
