
How big data and new technologies such as SAP HANA are changing the oil and gas industry – Part 1

In my previous blog post, “Big Data: A mysterious giant IT buzzword”, I referred to Gartner’s definition of big data and the five V’s that build on it. Here they are, with a focus on specific characteristics of upstream data:

  • Volume – Seismic data acquisition
  • Velocity – Real-time streaming data from well-heads, drilling equipment and sensors
  • Variety
    • Structured: standards and data models such as PPDM, SEG-Y, WITSML, PRODML, RESQML, etc.
    • Unstructured: images, log curves, well logs, maps, audio, video, etc.
    • Semi-structured: processed data such as analyses, interpretations, daily drilling reports, etc.
  • Veracity (data management practices that provide accurate, good-quality data)
    • Pre-processing to identify data anomalies
    • Run integrated asset models
    • Combination of seismic, drilling and production data
  • Value
    • Faster decisions and enhanced production
    • Reduced costs, such as Non-Productive Time (NPT)
    • Reduced risks in the areas of Health, Safety and Environment
    • Forecasting and planning using predictive analytics
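
The “pre-processing to identify data anomalies” step listed under Veracity can be illustrated with a small sketch. This is only one simple technique (a rolling z-score check), not a method prescribed in this post; the function name `flag_anomalies`, the window size, the threshold and the pressure readings are all made up for illustration.

```python
from collections import deque
from statistics import mean, stdev


def flag_anomalies(readings, window=5, threshold=3.0):
    """Return indices of readings that deviate from the preceding window.

    A reading is flagged when it lies more than `threshold` standard
    deviations from the mean of the previous `window` readings.
    """
    recent = deque(maxlen=window)
    flagged = []
    for i, value in enumerate(readings):
        if len(recent) == window:
            mu = mean(recent)
            sigma = stdev(recent)
            # Skip the check when the window is perfectly flat (sigma == 0).
            if sigma > 0 and abs(value - mu) > threshold * sigma:
                flagged.append(i)
        recent.append(value)
    return flagged


if __name__ == "__main__":
    # Made-up well-head pressure readings (psi) with one obvious spike.
    pressure = [1500.0, 1502.0, 1499.0, 1501.0, 1500.0,
                1503.0, 1950.0, 1501.0, 1499.0]
    print(flag_anomalies(pressure))
```

In a real project a check like this would sit at the front of the pipeline, so that suspect sensor values are reviewed before they feed integrated asset models.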

The oil and gas industry generates significant data volumes through the exploration, development and production of hydrocarbons. It conducts advanced geophysical modeling and simulation, and 2D, 3D and 4D seismic surveys generate large amounts of data during the exploration phase. Thanks to new technologies, we’re able to gather, integrate and interpret data from thousands of data-collecting sensors and track activity in real time or near real time (NRT). This means structured, semi-structured and unstructured datasets are growing daily.

The oil and gas industry has started to recognize the importance of faster access to accurate data in order to make decisions more quickly. So far, most analysis has been done the way it always has been: within individual technical disciplines and over a relatively small geographical study area. Now we see huge potential in using big data and in-memory technologies such as SAP HANA to learn much more from the data. We need access to the appropriate technology, tools and expertise to integrate and synthesize diverse data sources into a more manageable format and derive insight from these datasets. With big data analytics solutions, we’re able to manage the volume and complexity of the data and break down the barriers of geography and discipline to see the big picture. Currently only a handful of companies, such as Chevron and Shell, have adopted big data. However, the future looks promising and we expect strong demand for big data, in-memory technology and analytics solutions. Let’s say it will happen eventually!

Big Data: A mysterious giant IT buzzword

In the world of technology there are a hundred definitions of “Big Data”, and without a standard definition it is hard to settle on a single one. Like many other terms in technology, Big Data has evolved and matured, and so has its definition. Depending on whom we ask and in which industry or business field, we will get different definitions. Timo Elliott summarized some of the more popular ones in “7 Definitions of Big Data You Should Know About”.

You may be familiar with the three “V’s”, or the classic 3V model. However, this original definition does not fully describe the benefits of Big Data. More recently, two more V’s have been suggested: Value, and Verification or Veracity, which result from “Data Management Practices.” As a BI expert who has been involved in Big Data, my approach is to give my clients a practical definition that emphasizes the main characteristics of the data and the purpose of Big Data in each specific area. I like Gartner’s definition, which is not too long. Gartner treats the Volume, Velocity and Variety characteristics of information assets not as three parts but as one part of the Big Data definition.

Big data is high-volume, -velocity and -variety information assets that demand cost-effective, innovative forms of information processing for enhanced insight and decision making. (Gartner’s definition of big data)

The second part of the definition addresses the challenge of making the most of infrastructure and technology capabilities. These types of solutions are usually expensive, and clients expect a cost-effective and appropriate solution that meets their requirements. In my opinion this covers the other V, which relates to how we implement Data Management Practices in the Big Data Architecture Framework and its Lifecycle Model.

The third part covers the most important part and the ultimate goal, which is Value. Business value lies in gaining insight into the data and acting on that insight to make better decisions. To have the right vision, it’s important to understand, identify and formulate the business problems and objectives, knowing that practical Big Data solutions are feasible but not easy. So when I define Big Data for my clients, I use Gartner’s definition and explain the journey we need to take together to achieve their goal.

In any Big Data project, I start with the Big Data Architecture Framework (BDAF), which consists of Data Models, Data Lifecycle, Infrastructure, Analytic tools, Application, Management Operation and Security. One of the key components is high-performance computing storage. Since Big Data technologies are evolving and there are more options to consider, I’m focusing on SAP HANA capabilities, which enable us to design practical and more cost-effective solutions. HANA may be only one part of the overall Big Data Architecture Framework, but it’s the most essential part. The beauty of SAP HANA is that it is not just a powerhouse database but also a development platform that provides a real-time foundation for both analytics and transactional systems. It enables us to move beyond traditional data warehousing and the significant time spent on data extraction and loading. In addition, we’re able to take advantage of hybrid processing to design more advanced models. Another big advantage of HANA is that it can be integrated with SAP and non-SAP tools.

So, why am I so excited about it? Looking around, I see tons of opportunities and brilliant ideas that could get off the ground with some funding. So far, HANA has been more successful in large enterprises with big budgets and larger IT staffs. However, I’m also interested in encouraging medium-sized enterprises to see HANA’s potential to solve their problems. The majority of businesses don’t spend their budget on developing a solution; they are eager to pay to solve a particular problem. Our challenge as SAP consultants is to help businesses see that potential and how HANA will address their challenges. The good news is that SAP supports this by providing test environments and development licenses for promising startups.

Got your attention? Well, just to give you a glimpse, take a look at some of the success stories. There are many other cases if we look around. For instance, these days many applications, such as those used by trucking and transportation companies, capture geolocation data. That means capturing data every 10 seconds or so from every section, every piece of equipment, every location. This could add up to a petabyte of data! Analyzing it is an excellent way to gain insight, derive intelligence from the data and feed it back into scheduling and movement processes. Another example is companies that need to mine social media for information about their products and connect this intelligence to their back-end processes to increase customer engagement and satisfaction.
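
To make the “every 10 seconds” arithmetic concrete, here is a back-of-the-envelope sketch. The 10-second interval comes from the example above; the fleet size (100,000 tracked assets) and the record size (1 KB per reading) are purely illustrative assumptions, and the function names are hypothetical.

```python
# Back-of-the-envelope estimate of telemetry volume.
# All inputs are illustrative assumptions, not figures from any real fleet.

SECONDS_PER_DAY = 24 * 60 * 60
PETABYTE = 10 ** 15  # decimal petabyte, in bytes


def bytes_per_year(devices: int, interval_s: int, record_bytes: int) -> int:
    """Raw bytes produced in 365 days by `devices` reporting every `interval_s` seconds."""
    readings_per_device_per_day = SECONDS_PER_DAY // interval_s
    return devices * readings_per_device_per_day * record_bytes * 365


def years_to_reach(target_bytes: int, devices: int, interval_s: int,
                   record_bytes: int) -> float:
    """Years of continuous capture needed to accumulate `target_bytes`."""
    return target_bytes / bytes_per_year(devices, interval_s, record_bytes)


if __name__ == "__main__":
    # Hypothetical fleet: 100,000 assets, one 1 KB record every 10 seconds.
    yearly = bytes_per_year(devices=100_000, interval_s=10, record_bytes=1_000)
    print(f"{yearly / 10**12:.1f} TB per year")
    print(f"{years_to_reach(PETABYTE, 100_000, 10, 1_000):.1f} years to reach 1 PB")
```

Under these assumed inputs the raw volume comes to roughly 315 TB per year, so the petabyte mark would be reached after about three years. A larger fleet or richer records would get there sooner.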

So, do you have a Big Data challenge? With some funding, we’re able to provide a cost-effective and practical solution to your challenge that adds value to your business.