Structure:Data 2012

Wednesday, March 21

7:30 AM

REGISTRATION & BREAKFAST

8:30 AM

OPENING REMARKS

Moderated by:
Chris Albrecht

– Creative Director, GigaOM

Speakers: Derrick Harris

– Conference Chair and Writer, GigaOM

Stacey Higginbotham

– Senior Writer, GigaOM

Om Malik

– Founder, GigaOM

8:40 AM

Structuring Decisions from Unstructured Data

What’s the holy grail of big data? Cracking the code of mining unstructured data. Text and otherwise, unstructured data represents the majority of the big data universe’s “dark matter.” But what approaches will work? What are the real benefits, and how can your enterprise start making sense of the vastness of the data? Hear from leaders in the field with real-world approaches, and thoughts about the future of technologies for unstructured data.

Moderated by:
Seth Grimes

– Principal Consultant, Alta Plana

Speakers: Jason Hunter

– Deputy CTO, MarkLogic

Paul Speciale

– VP, Products, Amplidata

Staffan Truve

– CTO and Co-Founder, Recorded Future

9:10 AM

Machine Learning’s Impact on Business Models and Industry Structures

Big Data is putting a premium on insight and the speed of turning that insight into actionable decisions. Today’s analytics require human experts to ask the right questions. This panel will demystify how machine learning can change the equation between data, expertise, and decisions. The panelists represent several approaches to commercializing machine learning technology. In the process, we’ll look at how to use the technology to become a better data-driven enterprise and how that can change a company’s relative market share within its industry.

Moderated by:
George Gilbert

– Principal, Tech-Alpha and Analyst, GigaOM Pro

Speakers: Currie Boyle

– Distinguished Engineer, IBM

Alexander Gray

– CTO, Skytree

Mok Oh

– Chief Scientist, PayPal

Amarnath Thombre

– SVP, Strategy and Analytics, Match.com

9:45 AM

Querying Google’s Big Query

Google uses big data, Big Compute and “Big Smarts” to huge business advantage. With its recently announced Big Query product, it aims to provide Google-scale data analytics on billion-row data sets to customers. The power of such a tool affords many businesses the ability to access powerful data analytics for the first time. In this fireside chat, we talk to Google about its plans for the product and what impacts and opportunities it will create in the marketplace over the coming months.

Moderated by:
Stacey Higginbotham

– Senior Writer, GigaOM

Speakers: Ju-kay Kwek

– Product Manager, Google Cloud Platform Team, Google

10:00 AM

Smart Tools: Dissect, Digest and Deliver on Big Data

No one really needs an army of IT analysts. A new generation of tools is empowering business users of all abilities to derive value from big data, one digestible bite at a time. Intuitive interfaces on affordable and powerful cloud services mean that the right tools can be effective for the jobs at hand.

Speakers: Rachel Delacour

– CEO and Co-Founder, We Are Cloud

10:10 AM

Big Data, Security and Innovation

Massive amounts of energy data will emerge as utilities in the U.S. add information technology to their grids. But this also could make the grid even more vulnerable to security concerns as that energy data could be hacked by people and nations that could use it to their advantage. Woolsey’s background in defense and venture capital puts him in a unique position to discuss the intersections of big data, security, innovation and energy.

Moderated by:
Katie Fehrenbacher

– Senior Writer, GigaOM

Speakers: R. James Woolsey

– Chairman, Foundation for Defense of Democracies, Venture Partner, Lux Capital Management, and Former Director of Central Intelligence

10:30 AM

BREAK

(Optional Sponsor Workshops)

10:45 AM – EMC Sponsor Workshop – Oceanic Suite

How Big Data and Analytics Are Changing Organizations

The world of big data analytics is disruptive, and it will create new competitive environments. To be competitive, organizations will require new technology with clear implementation strategies, iterative test-and-learn environments and data science talent. Using real-world examples, this session will detail best practices for starting the journey toward big data analytics.

Speakers: Mike Maxey

– Senior Director of Product Marketing, Greenplum, a division of EMC

10:45 AM – LexisNexis Sponsor Workshop – Aquitania West Suite

Finding Fraud Rings Through Relationship Analytics: A healthcare case study with HPCC Systems

A case study on how HPCC Systems detected potential healthcare fraud and collusion for a major healthcare organization leveraging big data and relationship analytics. The case study will focus on the customer challenge, the data pieces leveraged and the results found, which led to an immediate investigation. HPCC Systems from LexisNexis® Risk Solutions offers a proven, open source, data-intensive supercomputing platform designed for an enterprise to solve big data problems.

Speakers: Bill Fox, JD

– Senior Director, LexisNexis Risk Solutions

Charles Kaminski

– Senior Architect, HPCC Systems

10:45 AM – DataStax Sponsor Workshop – Aquitania East Suite

The Top 5 Factors to Consider When Choosing a Big Data Solution

Do you have your fail-proof big data plan in place? Today most experts agree that big data is the next strategic business initiative in an enterprise and a key to staying ahead of the competition. During this session, Robin Schumacher, an industry expert and author, will cut through the hype and share five clear, actionable goals that will enable you to establish your big data initiative.

Speakers: Robin Schumacher

– VP of Products, DataStax

11:30 AM

The 3V’s of Big Data: Variety, Velocity and Volume

Today, businesses are hungry for data science that is as theoretically strong as it is fine-tuned. This has led to advances in dealing with the 3V’s of big data: Variety, Velocity, and Volume. This talk will map the next frontiers of data science, highlighting its impact on various industries, including healthcare, consumer services, marketing and advertising, social media and more.

Speakers: Nick Weir

– CEO, ChoozOn

11:45 AM

Enterprise Next: How Social Data Will Shape the Enterprise

E-mail without a doubt completely changed the way we work and what is expected of us as workers. As workers bring new mobile technologies and new messaging tools into the workplace, we stand at a time where we are generating huge amounts of data about the way we work, who we work with and what we work on. In this session, we talk to two leading data scientists who are using data to shape the enterprise tools of the future and will learn how data will drive the evolution of enterprise software.

Moderated by:
Mathew Ingram

– Senior Writer, GigaOM

Speakers: David Gutelius

– Chief Social Scientist, Jive Software

Cameran Evans

– Data Scientist, Socialcast by VMware

12:05 PM

Puzzling

Much in the same way a jigsaw puzzle incrementally reveals the big picture, vast information collections come together in a computationally similar way. This context-accumulating process is the essential magic necessary for computers to really make sense of the world. Four experiments in puzzling bring these mechanics to life and bring revelation about what must come next in the area of sensemaking systems. IBM Distinguished Engineer Jeff Jonas creates a framework to help attendees pull together and integrate all these pieces.

Speakers: Jeff Jonas

– Chief Scientist, Entity Analytics, IBM

12:20 PM

Socializing Data in the Enterprise

It must be great when you do not have any legacy database or business intelligence products to protect, as is the case for EMC. The company can focus its acquired Greenplum division on innovating the big-data experience. Its recent initiatives have centered on providing a “Facebook”-like social interface that aims to allow data exploration and insight socializing in an enterprise. We talk to Greenplum co-founder Scott Yara about EMC’s future ambitions to enable an enterprise to more easily realize value from its legacy of data.

Moderated by:
Om Malik

– Founder, GigaOM

Speakers: Scott Yara

– SVP, Products and Co-Founder, Greenplum, a division of EMC

12:40 PM

Big Ideas

Without big ideas, big data is a big black hole. We bring you a curated selection of speakers and companies to present their shining new innovations.

Speakers: Karthik Kannan

– Co-Founder and VP Products, Cetas Software

12:45 PM

The Trillion-Row Spreadsheet™

We consider the question: What if your spreadsheet could handle a trillion rows? This clarifies the distinction between which aspects of big data problems can be solved by better software engineering within an existing paradigm — and which problems need a new paradigm for data management. We also consider some of the implications of adopting a spreadsheet metaphor for managing large amounts of data.

Speakers: Robert Lefkowitz

– Director of Web Development, 1010data

1:00 PM

LUNCH

(Optional Sponsor Workshops)

1:15 PM – Teradata Aster Sponsor Workshop – Oceanic Suite

Radical Loyalty – Data Science Applied to Marketing

Online retailers can learn from historical customer-loyalty programs when considering how to leverage big data to answer tough questions: How can I make my loyalty program worthwhile for customers? How do I innovate with all the data available? Learn how Barnes & Noble applies data science to customer loyalty and digital marketing.

Speakers: Marc Parrish

– VP of Membership and Customer Retention Marketing, Barnes & Noble

1:15 PM – Aspera Sponsor Workshop – Aquitania West Suite

Enabling the Big Data Cloud

Companies leveraging the cloud for applications and infrastructure are facing challenges in moving big data in and out. The bottleneck in transfer speeds severely limits their ability to leverage the available computing and storage resources. In this session, learn how Aspera powers high-speed transfers to, from and between remote cloud infrastructures.

Speakers: John Heaton

– Director of Sales Engineering, Aspera

1:15 PM – MarkLogic Sponsor Workshop – Aquitania East Suite

Extending the Hadoop Ecosystem with MarkLogic for Real-Time Search and Monitoring

In this workshop we’ll explore how to extend the Hadoop ecosystem with the MarkLogic server to provide real-time search and monitoring capabilities to your big data infrastructure. The workshop will walk through a live application to demonstrate how this is all possible by integrating both platforms into a single architecture.

Speakers: Fernando Mesa

– Principal Technologist, MarkLogic

2:00 PM

Prediction Competitions: Adding the Human Touch to Big Data Problems

Making accurate predictions is the golden goose of data science. The more accurate your predictions, the more competitive advantage you can generate. One phenomenon that has been observed to create the most accurate predictions has been that of marketplaces functioning as prediction markets. In this fireside chat we’ll talk about one of the most disruptive approaches to applying prediction markets to big data problems.

Moderated by:
Ryan Kim

– Staff Writer, GigaOM

Speakers: Eric Huls

– VP, Allstate Insurance Company

Jeremy Howard

– President and Chief Scientist, Kaggle

2:20 PM

Mining the Oil of the 21st Century

“Crude Oil” underwrote the economic and industrial explosion of the 20th century. In the 21st century the new black gold is “Crude Data.” Plentiful and available. When refined, our expectations are that distilled insights will provide new profitability in an information economy. This session features the founder of Opera, a company leading the vanguard to be the “Oil Refinery” for customers through applying machine and human smarts to customers with huge vats of “Crude Data.” As the global industry creates more data, what future profitability can we look forward to?

Moderated by:
Derrick Harris

– Conference Chair and Writer, GigaOM

Speakers: Arnab Gupta

– CEO and Founder, Opera Solutions

2:35 PM

The Future for Hadoop

Hadoop adoption is surging in business. One of the early pioneers in this field is Cloudera. We’ll hear about their successes in the deployment of Hadoop, its continued growth and success, and the challenges Cloudera will face in its stewardship of the open-source project when it conflicts with the demands of the commercial marketplace. We also explore the burgeoning Hadoop ecosystem and what the next 24 months have in store.

Moderated by:
Jo Maitland

– Research Director, GigaOM Pro

Speakers: Michael Olson

– CEO, Cloudera

2:50 PM

Realizing Real-Time Value on the Real-Time Web

To date, big data has been mostly focused on batch and data science workloads. This is about to change with the advent of the real-time web. We’ll discuss why the next big sea-change in big data will be focused on real-time analytics, and why this is critical for delivering compelling user experiences based on consumer intelligence.

Speakers: Todd Papaioannou

– Founder and CEO, Continuuity

3:00 PM

Underwriting for the Underbanked Through Data Mining

ZestCash uses online data to help determine the credit worthiness of new customers, offering a more modern way of underwriting for the underbanked. Instead of relying on tools like FICO scores, ZestCash pulls in a wealth of data to help rank a person’s likelihood of defaulting. Using data like cell phone bill payments or the length of stay at a residence helps provide a fuller picture about a person’s ability to pay off loans and enables ZestCash to redefine an underserved, and often exploited, consumer segment. Learn more about how ZestCash is making data work for them.

Moderated by:
Mathew Ingram

– Senior Writer, GigaOM

Speakers: Douglas Merrill

– Founder and CEO, ZestCash

3:15 PM

Big Data: An Augmented Intelligence for Strategic Decision Making

Humans by nature are not well adapted to consuming millions of rows of numbers in a spreadsheet to make time-critical decisions. We need to create new ways for humans to interact with big data and make complex decisions. Images are much better for quickly understanding large volumes of information. Hear how you can take unstructured data, process and annotate it to create mathematical models, which in turn are used to drive a visualization platform for decision makers.

Speakers: Sean Gourley

– Co-Founder and CTO, Quid

3:30 PM

BREAK

(Optional Sponsor Workshops)

3:45 PM – Splunk Sponsor Workshop – Oceanic Suite

Creating Business Value Now from Big Data

Machine-generated big data contains a goldmine of intelligence useful for IT and business, but remains largely untapped. Using real-world examples, this session examines how organizations are laying the groundwork for success and demonstrates the operational value that’s possible from this important source of information.

Speakers: Mark Frost

– BIS Services, PepsiCo

Sanjay Mehta

– VP Product Marketing, Splunk

3:45 PM – Amplidata Sponsor Workshop – Aquitania West Suite

Big Unstructured Data

Big data can be simplified by understanding the differences between big data for analytics and big unstructured data. While they share commonalities in scale, the data and usage models are very different. This session contrasts these big data aspects and describes systems for managing each one in an optimized manner.

Speakers: Paul Speciale

– VP, Products, Amplidata

3:45 PM – RainStor Sponsor Workshop – Aquitania East Suite

Making Your Big Data Analytics Problem Smaller … and More Affordable

How can you manage your big data growth without operating a data center consisting of hundreds of nodes? Do you need to leverage MapReduce and give SQL-based access to your data without copying it out of Hadoop? This session will detail how the right form of compression can dramatically reduce nodes, while simultaneously BOOSTING data access performance – ultimately driving down deployment, integration and your overall total operating costs.

Speakers: Ramon Chen

– VP Product Management, RainStor

4:30 PM

600 Exabytes and Rising: The Future Challenges for Data Centers

The Journal of Science calculated that this year the world is expected to house about 600 exabytes of data. Where does this data reside? And how will you keep up with this exponential growth? Leaders from the largest data centers and storage providers in the world will join together on one panel to discuss how data centers will evolve over the next 18 months.

Moderated by:
Stacey Higginbotham

– Senior Writer, GigaOM

Speakers: Lane Patterson

– CTO, Equinix

Edward Newman

– Senior Director, Consulting, EMC Corporation

George Slessman

– CEO, IO

Jim Smith

– CTO, Digital Realty

4:40 PM

Invitation Only – GigaOM Pro Subscribers – Aquitania West Suite

GigaOM Pro Mapping Session: Hadoop in the Enterprise

The GigaOM Pro Mapping Session is an exclusive industry insight event where participants will interactively create and capture the real-time market. Participants will work directly with some of the leading thinkers in the cloud space to identify/weight major disruption vectors in the Hadoop and enterprise space; evaluate key players in the startup and establishment spheres; produce a predictive analysis of likely disruptors and disrupted companies; and establish actionable next steps.

Speakers: Carl Brooks

– Analyst, Infrastructure Services, Tier 1 Research

David Card

– Research Director, GigaOM Pro

George Gilbert

– Principal, Tech-Alpha and Analyst, GigaOM Pro

Phil Hendrix

– Founder and Director, immr and Analyst, GigaOM Pro

Julie Lockner

– Senior Analyst and VP, Data Management Solutions, Enterprise Strategy Group

Jo Maitland

– Research Director, GigaOM Pro

5:10 PM

Overcoming Fear of Trying

One of the biggest hindrances to companies obtaining benefits from their business analytics initiatives is the impact of behavioral economic factors and cognitive biases within many organizations. Organizational inertia, “tribal wisdoms,” fear of change and imprecise or incorrect assumptions can compound to inhibit value creation, profit maximization, innovation and efficiency. Hear some of the primary factors organizations must combat and methods to overcome these issues to ensure new and ongoing success with business analytics.

Speakers: John Lucker

– Principal, Deloitte

5:25 PM

Big Ideas

Without big ideas, big data is a big black hole. We bring you a curated selection of speakers and companies to present their shining new innovations.

Speakers: Jonathan Gosier

– Founder, metaLayer.com

J. Andrew Rogers

– Founder and CTO, SpaceCurve

5:35 PM

Mining Machine-Generated Data

Imagine a world of connected devices, sensors and machines, each of them constantly “tweeting” and updating its status. Are we envisioning a science-fiction dystopia or looking at technology’s next big feeding frenzy? In this fireside chat, we’ll talk to leading thinkers about the technology challenges we face in dealing with trillions of objects creating petabytes of data in real time. We’ll look at the challenges created by the forthcoming streams of sensor and machine data, and what kinds of opportunities are created for business in a completely connected world.

Moderated by:
Derrick Harris

– Conference Chair and Writer, GigaOM

Speakers: Usman Haque

– Founder and CEO, Pachube.com

Erik Swan

– CTO and Co-Founder, Splunk

5:55 PM

CLOSING REMARKS

6:00 PM

BASHO COCKTAIL RECEPTION

Thursday, March 22

8:00 AM

REGISTRATION & BREAKFAST

9:00 AM

OPENING REMARKS

Speakers: Chris Albrecht

– Creative Director, GigaOM

9:05 AM

Big Data, Bigger Brother

We laud big data when it’s processing data types such as social media feeds, genome-sequencing data and server logs, but would the positive tone change if we were talking about monitoring your every digital interaction while at work to discover questionable behavior? Most people’s tone would, but Cataphora CEO Elizabeth Charnock doesn’t agree, at least when it comes to the workplace. In fact, she thinks that in a world with increasingly larger corporations and distributed workforces, companies will be doing themselves and their employees a big favor by keeping close tabs on what employees are doing.

Moderated by:
Mathew Ingram

– Senior Writer, GigaOM

Speakers: Elizabeth Charnock

– CEO, Cataphora

9:20 AM

Security, Between Algorithms and Legislature

For the last 30 years, we have been shifting from a society that places value on physical goods to a society that increasingly gains most of its value from invisible data. Data now has immeasurable value. As our sophistication in creating and storing data increases, so does that of those who want to steal it. This session will examine several facets of data security and provide a starting point for you to create your own big data security policy.

Moderated by:
Derrick Harris

– Conference Chair and Writer, GigaOM

Speakers: Jim Benedetto

– CTO, Gravity

Ashlie Beringer

– Partner, Gibson, Dunn & Crutcher LLP

9:40 AM

What’s Next for Hadoop: State of the Ecosystem and Changes Afoot

In no time, Hadoop has become the go-to solution for big data and analytics. Hadoop and the emerging ecosystem of companies, solutions, and customer deployments have transformed this movement from early adopters to mainstream enterprise. We’ll explore what has changed from a product perspective, where the next crop of Hadoop-focused startups are coming from, and what customers are doing today and in the near future. Be sure to join this comprehensive look at the Hadoop market.

Moderated by:
Jo Maitland

– Research Director, GigaOM Pro

Speakers: Justin Borgman

– CEO and Co-Founder, Hadapt

Mark Cusack

– Chief Architect, RainStor

James Markarian

– EVP and CTO, Informatica

Ari Zilka

– Chief Products Officer, Hortonworks

10:10 AM

After The Hadoopla: Building Hadoop’s Successor

Recently LexisNexis Risk Solutions released the open source code of its core data-processing-and-delivery software that has helped to scale its data business from $0 to $1.4 billion in about ten years. In this session, we talk to the person behind the open source initiative on why the business world and enterprise organizations will need something better than Hadoop, and how the open source developer community will now be engaged in the project’s evolution.

Moderated by:
Derrick Harris

– Conference Chair and Writer, GigaOM

Speakers: Armando Escalante

– CTO, HPCC Systems from LexisNexis Risk Solutions

10:25 AM

Big Ideas

Without big ideas, big data is a big black hole. We bring you a curated selection of speakers and companies to present their shining new innovations.

Speakers: Barry Morris

– Founder and CEO, NUODB

10:30 AM

BREAK

10:45 AM – StackIQ Sponsor Workshop – Oceanic Suite

How to Live with the Elephant in your Server Room

Come learn how to take Hadoop from proof-of-concept to full-scale deployment without the usual hassles. Let the experts in big infrastructure deployment and management show you what to watch out for when making the jump from testing to full-scale deployment. Be prepared. Know the issues. Choose the right tools for the job and make your large scale cluster deployment and management easier.

Speakers: Mason Katz

– CTO, StackIQ

Tim McIntire

– President, StackIQ

10:45 AM – Sybase Sponsor Workshop – Aquitania West Suite

Conquering Big Data Analytics in a Heterogeneous World

In this presentation, we will consider all five vectors of big data analytics by exploring various techniques where traditional but progressive technologies such as in-memory technology, column store DBMS and Event Stream Processing are combined with open source frameworks such as Hadoop to exploit the full potential of big data analytics.

Speakers: David Wiseman

– Director Business Development, Sybase, an SAP Company

10:45 AM – SoftLayer Sponsor Workshop – Aquitania East Suite

Big Data, Internet Scale – Building a Global Object Storage Platform

In this workshop, we will discuss the design and implementation of a global object-storage platform with intelligent indexing and search. Learn how SoftLayer architected a platform to deliver object storage at Internet scale, and how Cloudant is delivering a scalable and globally distributed data layer to further extend the platform. Technologies discussed will include OpenStack Swift and Cloudant BigCouch.

Speakers: Marc Jones

– VP of Product Innovation, SoftLayer Technologies

Mike Miller

– Chief Scientist, Cloudant

11:30 AM

Supporting Querying on Multi-Million Events per Second

Applications are emerging that produce event-stream data that is too voluminous to commit to a database and process within the lifetime of the data. We need to rethink the way we can query and derive results. This talk presents an architecture that repurposes SQL into a massively parallel dataflow language that can process massive volumes of data in real-time with a latency of milliseconds.

Speakers: Damian Black

– CEO, SQLstream

11:45 AM

Real Time Data and Decision Making

comScore has built its business on making sense of data. As its addressable market has grown, so has the typical data-set sizes it has to deal with, along with customers demanding quicker insights to make faster decisions. We talk to one of comScore’s founding team and chief technologist about the future of decision-making technology evolving in a context where there is more data, less time, higher business demands, and increasing computing power.

Moderated by:
Om Malik

– Founder, GigaOM

Speakers: Mike Brown

– CTO, comScore

12:05 PM

Applying Mars Mission Rocket Science to Real-Time Decision-Making

This session will trace how a combinatorial programming language developed by a team of MIT scientists to power NASA’s Mars Mission project has successfully been repurposed for an unlikely commercial use – digital advertising. Learn how the original inventor of this programming language applied “signals and systems” thinking to facilitate a digital advertising platform that uses scalable, intelligent machine-learning techniques for big data analytics and automated real-time decision making.

Speakers: Dr. Bill Simmons

– CTO and Co-Founder, DataXu

12:20 PM

Synthesizing Insights and Capitalizing on Consumers’ Digital Signals

Every time a consumer clicks, buys, checks in, or does anything on the Internet, they are creating a digital mark or impression somewhere. Each of these is happening at an ever-increasing rate, creating “digital signal streams.” This panel examines the nature and importance of consumers’ “digital signals,” the challenges inherent in processing them and what key insights can be derived between the different streams. You’ll hear examples of companies successfully inferencing from “Consumer Digital Signals” and what approaches can be taken to protect consumers from inadvertent breaches of their privacy.

Moderated by:
Phil Hendrix

– Founder and Director, immr and Analyst, GigaOM Pro

Speakers: George John

– CEO, Rocket Fuel

Jed Kolko

– Chief Economist and Head of Analytics, Trulia

STS Prasad

– VP, Pricing and Infrastructure, @WalmartLabs

12:45 PM

Real Time Intelligent Systems and Big Data Streams

There is vast potential for performance gains from applying predictive analytics in real time in many domains, and a need for systems that can “anticipate.” The oncoming onslaught of sensor data in addition to enterprise-generated content, all injected into an enterprise, provides abundant opportunities for new insights. For the advanced analytics and data-mining practitioner, what differentiates is the consumption of analytics. Join us for a lively discussion on best practices, including industry examples of how you can use off-the-shelf components to stitch a system together, bolt on a few machine-learning algorithms, and optimize.

Speakers: Zubin Dowlaty

– VP and Head of Innovation and Development, Mu Sigma

1:00 PM

LUNCH

1:15 PM – Logicworks Sponsor Workshop – Oceanic Suite

Big Data in the Cloud: Leveraging Private and Public Clouds for Maximum Uptime and Performance

Logicworks designs and manages complex hosting and cloud infrastructures for clients with critical applications and content. The Logicworks team will share the lessons they’ve learned in designing big data architectures using a combination of public cloud, private cloud and bare metal hosting solutions. They will explain best practices for monitoring, configuring high-availability, storage, backups and other key strategies in managing these complex environments.

Speakers: Jason McKay

– Director of Engineering, Logicworks

Kenneth Ziegler

– President and COO, Logicworks

1:15 PM – Oracle Sponsor Workshop – Aquitania West Suite

Getting Started with Big Data

The first step to unlocking the value of big data is to collect it. In this session we will examine the two technologies most commonly used to acquire big data – Hadoop Distributed File System and NoSQL databases – and discuss the use cases they each address.

Speakers: Ashok Joshi

– Senior Director, Oracle

1:15 PM – Cleversafe Sponsor Workshop – Aquitania East Suite

Store and Analyze Big Data Without Limits

It’s not unrealistic to think that companies looking to mine big data would need to effectively store and analyze exabyte-level data now or in the near future. Learn how dispersed storage technology breaks the barriers of traditional approaches to enable a truly limitless storage system that won’t break the bank.

Speakers: Russ Kennedy

– VP of Product Strategy, Marketing and Customer Solutions, Cleversafe

2:00 PM

Harnessing the Real-Time Web: Wordnik’s Take on Words

We are moving to the next evolution of the consumer web: the real-time web. As consumers demand a much more responsive and tactile experience with websites, the infrastructure powering these must also evolve. In this session, we look at Wordnik and how it has been able to shape its product proposition through utilizing technologies designed with responsiveness in mind. We also talk to them about the co-evolution of the real-time web, real-time big data streams and the NoSQL technologies needed to support these demands.

Moderated by:
David Card

– Research Director, GigaOM Pro

Speakers: Dwight Merriman

– CEO and Co-Founder, 10gen

Tony Tam

– VP of Engineering and Technical Co-Founder, Wordnik

2:20 PM

Big DNA Data in the Cloud

As DNA sequencing becomes increasingly affordable and accessible, the massive amount of genomics data represents opportunities to advance personal medicine. But how do you share a data set so incredibly big? In this talk we explore how a cloud data platform strategy will rapidly enable the sharing and analysis of the world’s DNA data with the innovators who need it most.

Moderated by:
Derrick Harris

– Conference Chair and Writer, GigaOM

Speakers: Andreas Sundquist

– CEO and Co-Founder, DNAnexus

2:35 PM

Computational Storage – Solving the Biggest Problems, Faster

Traditional computational and storage methodologies don’t scale easily or cost-effectively into the current generation of big data problems. Computational Storage is an alternative set of architectures that massively optimize solution design and construction. We’ll explore several test cases in Genomics, Risk Analysis, Logistics and Satellite Imagery Analysis.

Moderated by:
Stacey Higginbotham

– Senior Writer, GigaOM

Speakers: Sultan Meghji

– VP of Cloud Applications, Appistry

2:50 PM

Faster Memory, Faster Compute

High-performance computing has traditionally been the most demanding area of computing that pushed the edge of the envelope. In these circumstances a lot of pioneering work has been done that ultimately filtered down to more pervasive and affordable applications. In this fireside chat we talk to a scientist from Los Alamos National Laboratory about how they reduce the time and cost of getting to critical insights by reducing the distance and latency of compute memory. The conversation will look to successes in current deployments and what this signals for computer architectures for big data problems in the long term.

Moderated by:
Rich Brueckner

– President, inside-BigData

Speakers: Gary Grider

– Deputy Division Leader, HPC Division, Los Alamos National Laboratory

Garth Gibson

– Co-Founder and CTO, Panasas

3:05 PM

Big Data Security, Big Challenges: Start Here

Security at scale is harder than you’d think, especially when your big data platform is based on Infrastructure as a Service cloud computing. Join us for this introductory fireside chat as we discuss encryption for big data, how virtualization affects the security of big data, and emerging practices that will provide a big boost for big data security.

Moderated by:
Davi Ottenheimer

– President, flyingpenguin and Analyst, GigaOM Pro

Speakers: Dave Asprey

– VP Cloud Security, Trend Micro

3:25 PM

Flash Memory: The Big Data Application Accelerant

Big data isn’t just big – it’s messy. CIOs are faced with the daunting task of unlocking the value of their data efficiently in the time frame required to make accurate decisions. This session will look at leading Fortune 500 companies that have leveraged solid state technology over mechanical disks to accelerate compute time.

Speakers: Scott Metzger

– VP of Analytics, Violin Memory

3:40 PM

BREAK

3:55 PM – Neustar Sponsor Workshop – Oceanic Suite

Consumers vs. Creators: Making Sense of the New Data Marketplace

Neustar’s Peter Kirwan will give a ‘guided tour’ of today’s new Data Marketplace, helping us to navigate the rapidly emerging world of APIs, services, and structured and unstructured data in the cloud. Kirwan will address questions like: Who are the consumers of this data, and who are the creators? How do we integrate data from multiple data sources and make it readily available to the communities of interest? This interactive session will touch on the best tools, sites and methods.

Speakers: Peter Kirwan, Jr.

– VP, Entrepreneur In Residence, Neustar

3:55 PM – Canonical Sponsor Workshop – Aquitania West Suite

Distributed Data Done Easily

Come and see live demonstrations of cluster deployment in minutes on the best platform for big data. Ubuntu Server supports leading ISVs, is certified with common hardware and scales to meet your business and budget requirements.

Speakers: Mark Baker

– Server Product Manager, Canonical

3:55 PM – Platform Computing Sponsor Workshop – Aquitania East Suite

Getting the Most from Your Hadoop Big Data Cluster

The Hadoop framework is an established solution for big data management and analysis. In practice, Hadoop applications vary significantly. Your data center infrastructure is used by multiple lines of business and multiple differing workloads. This session looks at the requirements for a multi-tenant big data cluster: one where different lines of business, different projects and multiple applications can be run with assured SLAs, resulting in higher utilization and ROI for these clusters.

Speakers: Scott Campbell

– Product Manager, Enterprise Analytics, Platform Computing, an IBM Company

Rohit Valia

– Marketing Director, Platform Computing, an IBM Company

4:40 PM

Mining the Mobile Data Deluge

The most successful complex technology of all time is the mobile phone. As everyone on the planet gets a phone, connects to a network and creates data, we sink deeper into an ocean of data. This ocean represents a huge opportunity for those willing to dive into its depths and fish for insights. We talk to the most innovative new thought leaders in this space about how they are creating new value from the insights they generate and what is still left to explore in the uncharted depths of the mobile data ocean.

Moderated by:
Ryan Kim

– Staff Writer, GigaOM

Speakers: Michael Driscoll

– CTO, Metamarkets

Raj Aggarwal

– CEO and Co-Founder, Localytics

5:10 PM

The Data Warehousing Debate: Traditional vs. Open Source

Find out how a company’s data mining needs can inform its warehousing decision. While traditional warehousing solutions maintain their popularity, engineering-focused companies are starting to switch to Hadoop. Hear the pros and cons of both types of solutions, including their respective strengths and weaknesses in terms of expense, operational management, scalability, extensibility, reporting, and security, as discovered by the Data Discovery group at Eventbrite.

Speakers: Vipul Sharma

– Principal Software Engineer and Engineering Manager, Eventbrite

5:25 PM

Analyzing Large-Scale User Data with Hadoop and HBase

Doing data science at scale on Hadoop and HBase is difficult, and data about users presents unique challenges. In this talk you will learn why user-centric data is different from other types of large data, and how to leverage the right data layout to build an integrated analysis and serving platform. We will describe an architecture designed to perform these analyses (WibiData), see real-world examples of user-centric data analysis solutions, and talk about the potential future of user data analysis.

Speakers: Aaron Kimball

– CTO, WibiData

5:40 PM

CLOSING REMARKS

Speakers: Derrick Harris

– Conference Chair and Writer, GigaOM

Stacey Higginbotham

– Senior Writer, GigaOM

Om Malik

– Founder, GigaOM

5:45 PM

COCKTAIL RECEPTION

Speaker Lineup

Justin Borgman
CEO and Co-Founder, Hadapt
Mark Cusack
Chief Architect, RainStor
Karthik Kannan
Co-Founder and VP Products, Cetas Software
Amarnath Thombre
SVP, Strategy and Analytics, Match.com
Seth Grimes
Principal Consultant, Alta Plana
Ari Zilka
Chief Products Officer, Hortonworks
J. Andrew Rogers
Founder and CTO, SpaceCurve
Sean Gourley
Co-Founder and CTO, Quid
Vipul Sharma
Principal Software Engineer and Engineering Manager, Eventbrite
Elizabeth Charnock
CEO, Cataphora
Zubin Dowlaty
VP and Head of Innovation and Development, Mu Sigma
Rich Brueckner
President, inside-BigData
Tony Tam
VP of Engineering and Technical Co-Founder, Wordnik
Mok Oh
Chief Scientist, PayPal
George John
CEO, Rocket Fuel
Dwight Merriman
CEO and Co-Founder, 10gen
Todd Papaioannou
Founder and CEO, Continuuity
Michael Driscoll
CEO and Co-Founder, Metamarkets
Arnab Gupta
CEO and Founder, Opera Solutions
Ryan Kim
Staff Writer, GigaOM
Ashlie Beringer
Partner, Gibson, Dunn & Crutcher LLP
Jonathan Gosier
Founder, metaLayer.com
Lane Patterson
CTO, Equinix
Gary Grider
Deputy Division Leader, HPC Division, Los Alamos National Laboratory
Edward Newman
Senior Director, Consulting, EMC Corporation
Ju-kay Kwek
Product Manager, Google BigQuery, Google
David Gutelius
Chief Social Scientist, Jive Software
Erik Swan
CTO and Co-Founder, Splunk
Alexander Gray
CTO, Skytree
Rachel Delacour
CEO and Co-Founder, We Are Cloud
Om Malik
Founder, GigaOM
Armando Escalante
CTO, HPCC Systems from LexisNexis Risk Solutions
Barry Morris
Founder and CEO, NUODB
Garth Gibson
Co-Founder and CTO, Panasas
Derrick Harris
Conference Chair and Writer, GigaOM
Davi Ottenheimer
President, flyingpenguin and Analyst, GigaOM Pro
Stacey Higginbotham
Senior Writer, GigaOM
Currie Boyle
Distinguished Engineer, IBM
John Lucker
Principal, Deloitte
Andreas Sundquist
CEO and Co-Founder, DNAnexus
Staffan Truve
CTO and Co-Founder, Recorded Future
STS Prasad
VP, Pricing and Infrastructure, @WalmartLabs
Aaron Kimball
CTO, WibiData
Scott Metzger
VP of Analytics, Violin Memory
Sultan Meghji
VP of Cloud Applications, Appistry
Raj Aggarwal
CEO and Co-Founder, Localytics
George Gilbert
Principal, TechAlpha Partners and GigaOM Pro Analyst
James Markarian
EVP and CTO, Informatica
Eric Huls
VP, Allstate Insurance Company
Jeff Jonas
Chief Scientist, Entity Analytics, IBM
Usman Haque
Founder and CEO, Pachube
David Card
Research Director, GigaOM Pro
George Slessman
CEO, IO
Jim Smith
CTO, Digital Realty
Michael Olson
CEO, Cloudera
Dave Asprey
VP Cloud Security, Trend Micro
Jed Kolko
Chief Economist and Head of Analytics, Trulia
Jason Hunter
Deputy CTO, MarkLogic
Damian Black
CEO and Founder, SQLstream
Katie Fehrenbacher
Senior Writer, GigaOM
R. James Woolsey
Chairman, Foundation for Defense of Democracies, Venture Partner, Lux Capital Management and Former Director of Central Intelligence
Phil Hendrix
Founder and Director, immr and Analyst, GigaOM Pro
Jo Maitland
Research Director, GigaOM Pro
Nick Weir
CEO, ChoozOn
Rob Mee
CEO, Pivotal Labs
Robert Lefkowitz
Director of Web Development, 1010data
Jim Benedetto
CTO, Gravity
Douglas Merrill
Founder and CEO, ZestCash
Jeremy Howard
President and Chief Scientist, Kaggle
Dr. Bill Simmons
CTO and Co-Founder, DataXu
Cameran Evans
Data Scientist, Socialcast by VMware
Scott Yara
SVP, Products and Co-Founder, Greenplum, a division of EMC
Mike Brown
CTO, comScore
Paul Speciale
VP, Products, Amplidata