Dark Matter in the Enterprise Universe: The “Dark Data” Opportunity

March 5, 2013

Big Data (Photo credit: Kevin Krejci)

The “Big Data Tsunami” was the theme of my last post. Today I want to share another angle on Big Data. This angle resembles the way physicists look at dark matter in the universe: it is there, and we can calculate its mass, but it eludes direct observation.

Dark Data is the untapped mass of under- (or un-) utilized data whose existence is widely unknown or unrecognized in businesses of all kinds. Yet these dark data might contain valuable information, if only we were able to tap them.

The problem lies in the fact that most of this Dark Data is present in unstructured (free text descriptions, free text observations, notes, etc.) or non-textual formats (pictures, videos, audio files and more).

Infosys has recently announced its BigDataEdge platform, which the company says will radically simplify the task of analyzing Big Data. It published an infographic that nicely explains the issue (see below). Many companies are developing similar systems that will allow businesses to gain valuable information from all these data, which today are hidden in the closet.

The challenge is in how to formalize unstructured data. The Infosys approach includes (quote from the announcement):

  • A rich visual interface, with more than 50 customizable dashboards and 250 built-in algorithms. These algorithms, a set of reusable business rules both function and industry-specific, enable business teams to self-serve the process of building insights while minimizing the need for technical intervention
  • Over 50 data source connectors, which allow easy access to structured and unstructured data residing across enterprise and external sources. This would enable acceleration of discovery of relevant information from existing, underutilized data
  • A powerful collaboration wall and pre-built workflows that allow teams across functions to interact on insights and collectively implement decisions
  • A Logical Data Warehouse providing a virtual data management architecture, eliminates the need for physical availability of data to build and test insights
  • ‘Out-of-the-box’ applications for specific industry needs such as fraud detection and prevention, predictive analytics and monitoring, and customer micro-segmentation that deliver faster returns on investment 

(end quote)

But how do we make all these data accessible? One approach is to keep all the data in a virtual data center, a.k.a. the cloud. That way, system-to-system interfaces do not get in the way of data aggregation.

Just marvel with me at what this could mean for research: mining existing research data, observations, and notes in lab journals from all the experiments that were filed away because the results had not corroborated the original hypothesis. Would you agree that undetected gold nuggets are still buried in the mud of unstructured information? Imagine that all this information could be tested against new hypotheses and checked for weak correlations and connections that went undetected before.
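To make the idea of checking filed-away data for correlations a bit more concrete, here is a minimal sketch in Python. It assumes the old lab notes have already been turned into numeric columns, which is the hard part and is not shown here. The function names, the sample measurements, and the 0.3 threshold are all made up for illustration; they do not come from any of the products mentioned in this post.

```python
from itertools import combinations
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long number lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column cannot correlate with anything
    return cov / sqrt(var_x * var_y)


def scan_correlations(columns, threshold=0.3):
    """Return (name_a, name_b, r) for every column pair with |r| >= threshold."""
    hits = []
    for a, b in combinations(sorted(columns), 2):
        r = pearson(columns[a], columns[b])
        if abs(r) >= threshold:
            hits.append((a, b, r))
    # Strongest relationships first
    return sorted(hits, key=lambda hit: -abs(hit[2]))


# Hypothetical measurements recovered from archived lab journals
experiments = {
    "temperature_c": [20, 25, 30, 35, 40],
    "yield_pct": [41, 48, 52, 61, 66],
    "stir_rpm": [300, 500, 100, 200, 400],
}

for name_a, name_b, r in scan_correlations(experiments):
    print(f"{name_a} vs {name_b}: r = {r:.2f}")
```

A scan like this only flags candidates. With many columns, some pairs will correlate by pure chance, so every hit still has to be tested as a proper hypothesis.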

We are only at the beginning of a new development here. More interesting inventions and innovations lie ahead of us.

Where do you see interesting new developments coming in this space?

Infographic on Dark Data

Infosys’ depiction of Dark Data (link to the original website; click to enlarge)


Data Are Growing Up: The Big Data Tsunami

January 29, 2013

Visualization of all editing activity by user “Pearle” on Wikipedia (Pearle is a robot). To find out more about this project, go to: http://www.research.ibm.com/visual/projects/chromogram.html (Photo credit: Wikipedia)

When our information systems started to grow and a few tables were no longer able to hold the (mostly numerical) information, we started to build databases.

When databases became too small to contain all the information we were dealing with, or the data were distributed across different (not so compatible) data stores, we invented the data warehouse.

But to fit into these warehouses, data had to be structured. Life, though, is different, and today we are dealing with tons of poorly structured and unstructured data. So here is the latest trend: Big Data.

The industry successfully coined this as a new buzzword and, as always when a new buzzword hits the market, definitions of the term differ by company, speaker, and region (here in Switzerland, by canton).

Raj Sabhlok wrote in Forbes: “For example, most organizations have their data in structured relational databases like Oracle, but much of the data generated today is unstructured, high-volume web data or machine data. Technologies like Hadoop and “NoSQL” databases, such as Cassandra and MongoDB, are better designed to support massive data processing and storage. Emerging technologies such as Storm and Kafka are designed to provide real-time streaming analytics, which is critical for volume data feeds such as social networks. Even ad-hoc query tools such as Dremel have been introduced to support Big Data environments with low latency.

“Big Data also brings new skill-set challenges. As companies look to answer the most relevant questions related to their businesses, they will need data analysts or “data scientists” to mine the data. And they should get started soon; according to a recent McKinsey study, the United States alone faces a shortage of up to 190,000 workers with analytical expertise, as well as another 1.5 million managers and analysts that have the skills to understand and make decisions based on Big Data analysis.

“The Big Data movement is the recognition that there’s “gold in them there data stores!” There are tons of real-world examples of Big Data done right — just ask President Obama. However, it’s not something to dive into without first doing some serious soul-searching about your company’s goals. And it’s definitely crucial to have the right tools to support your unique corporate needs. But as professor Clemen always used to ask, “What would you pay for perfect information?”

Dilbert on Big Data (image at dilbert.com)

One of the newer methods to introduce new terms and to explain novel concepts has been the use of infographics. You can find several such examples in this blog when you enter “infographic” as a search term in the rightmost column.

Infosys just published one of those infographics on Big Data in the enterprise. I like this graphic because it clearly explains the concepts behind the buzzword (click to enlarge).

Big Data Infographic by Infosys

Infographically speaking: big data in 2013, by Rajeev Nayar (click to enlarge)

Another useful infographic on Big Data was recently published by Muhammed Saleem: Big Data and the future of our health. He maintains that medical diagnoses, general patient care, and medical practices are often more expensive and of lower quality than they should be. According to the graphic, Big Data could revolutionize healthcare by replacing up to 80% of what doctors do while still maintaining over 91% accuracy. The graphic is displayed at http://www.insurancequotes.org/2013/01/15/big-data-and-the-future-of-healthcare/ (click to enlarge).

Big Data in Healthcare. From the page mentioned above (click to enlarge).

The importance of Big Data analysis has recently been reported in the context of President Obama’s re-election. Crovitz wrote on Nov. 19, 2012 in the Wall Street Journal:

When the Obama campaign emailed supporters to join a $40,000-a-ticket dinner in June at the New York home of actress Sarah Jessica Parker, journalists at ProPublica noticed something odd. They uncovered seven versions of the email solicitation for the fundraiser, some mentioning a second fundraiser that night, a concert by Mariah Carey, others that Ms. Parker is a mother, and still others that Vogue editor Anna Wintour would be at the dinner.

Who got which email depended on “big data”—information about each fundraising prospect and how different people react to different messages. In this year’s election, it looks as if the Obama team’s use of such data was one of its biggest edges over the Romney effort.

[...]

The Obama campaign focused on data showing the “persuadability” of voters. Multivariate tests identified issues and positions that could move undecided voters, ProPublica said: “The persuasion scores allowed the campaign to focus its outreach efforts—and their volunteer calls—on voters who might actually change their minds as the result. It also guided them in what policy messages individual voters should hear.” (Read the full article here)

Big Data hold potential that is so far untapped. Pharma companies will have to deal with a massive data deluge when comparing the genome information of thousands of people to find patterns that correlate with certain diseases and give clues to possible medications.

Are we becoming more transparent? You bet. But we have to learn to mask data so that they cease to be personally identifiable information (PII). See my blog post on “You Have Zero Privacy Anyway — Get Over It” (Really?)
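What could such masking look like in practice? The following is a minimal Python sketch, under the assumption that each record is a simple dictionary. The field names, the `mask_record` helper, and the sample record are hypothetical. The sketch drops direct identifiers, replaces the record ID with a keyed hash (pseudonymization), and coarsens the birth year to a decade.

```python
import hashlib
import hmac

# Fields treated as direct identifiers in this example (an assumption)
DIRECT_IDENTIFIERS = {"name", "email"}


def pseudonymize(value, secret_key):
    """Replace a value with a keyed hash so records stay linkable but unnamed."""
    digest = hmac.new(secret_key, value.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def mask_record(record, secret_key, id_field="patient_id"):
    """Return a copy of the record with PII removed, hashed, or generalized."""
    masked = {}
    for field, value in record.items():
        if field in DIRECT_IDENTIFIERS:
            continue  # drop names, e-mail addresses, and the like
        if field == id_field:
            masked[field] = pseudonymize(str(value), secret_key)
        elif field == "birth_year":
            # Generalize the exact year to a decade
            masked["birth_decade"] = (int(value) // 10) * 10
        else:
            masked[field] = value
    return masked


# Hypothetical record for illustration only
record = {
    "patient_id": 12345,
    "name": "Jane Example",
    "email": "jane@example.com",
    "birth_year": 1974,
    "diagnosis": "asthma",
}

print(mask_record(record, secret_key=b"keep-this-key-secret"))
```

Keep in mind that masking of this kind is only a first step. Combinations of the remaining fields can still single out a person, so it does not by itself guarantee anonymity.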

Do you want to share some insight or other infographic on the subject? What is your take on Big Data?


Gravity-Powered LED Lamp Revolutionizes Self-Sufficient Lighting

December 14, 2012

How a gravity-powered LED could revolutionize cheap lighting | SmartPlanet

For approximately US$5, GravityLight can bring light into any dwelling. Picture from the article

This is a very interesting new development. With the arrival of affordable LEDs and their steadily improving efficiency and light quality, new mechanisms can be used to power these lights in any dwelling.

British designers Martin Riddiford and Jim Reeves of GravityLight use just that: gravity, with a sand-filled bag or other weights, to power the lamp. The lamp also serves as a power station for radios or other low-wattage devices.

See for yourself and watch the video at SmartPlanet: How a $5 gravity-powered LED could revolutionize cheap lighting | SmartPlanet.


Chicago Debuts Smog-Eating Street | SmartPlanet

October 22, 2012

Sarah Korones reports on an interesting story in SmartPlanet: Chicago debuts smog-eating street (21-OCT-2012).

Drawing symbolizing Green City

Image at ChicagoRealEstate.com

From the article: “In an effort to clean up its city streets, the Chicago Department of Transportation (CDOT) has set out to create the “greenest street in America” and short of closing off the road all together and turning it into a park, they seem to have done just that.

“Officials from the department have completely transformed a two-mile stretch of Cermak Road and Blue Island Avenue in the city’s Pilsen neighborhood, an industrial section of the city that is frequented by trucks passing through.

“To start, the street’s pavement has been replaced with a new variety that actually cleans the surface of the road while removing pollution from the surrounding air.  Photocatalytic cement removes nitrogen oxide gases from the air through a catalytic reaction driven by UV light, according to CDOT. In addition, a slew of recycled materials were blended into both the street and sidewalk’s pavement.”

Read the full article here.


Hackers join McAfee to combat electric vehicle viruses | SmartPlanet

August 24, 2012

SmartPlanet just published an interesting article that I would like to share with my readers: Hackers join McAfee to combat electric vehicle viruses | SmartPlanet.

Electric vehicles are on Wifi — and in danger. Picture as it appears in the article

Here are the first paragraphs of the article:

“A team of hackers working for security company McAfee is one of a small number of firms considering the ways to protect electric vehicles (EVs) from security threats, Reuters reports.

“Automakers may be jumping at the chance to fit cars with a number of gizmos aimed at enticing consumers, including wireless connections and dashboard apps, but as these vehicles use the same wireless technology that mobile devices and personal computers use, they are also vulnerable to the same security flaws.

“The consequences of remote attacks have serious consequences. From theft to eavesdropping on conversations, if a car’s security is compromised, it could also confuse navigation systems and potentially cause accidents.”

Read the full article here.


The Periodic Table Of The Social Web

July 24, 2012

Having studied chemistry at university for six years, I remember the Periodic Table of Elements well from the lectures in inorganic chemistry.

With some amusement I recently came across a new “Periodic Table” of the Social Web: The Periodic Table Of The Social Web | The Favo.rs Blog:

The Periodic Table Of The Social Web

Created By Favo.rs

Don’t be alarmed if you don’t know some of the “elements” in Social Media. Unlike their chemical counterparts they come and go . . .

The relevance of all this? See the video below.


Five Tech Trends Impacting Business Innovation in 2012

June 25, 2012

Blood pressure monitor with iPhone app. Image at the article quoted. All rights with the original publisher

In January, Tim Sweeney wrote a blog post on Innovation Excellence about this year’s tech trends.

I only recently stumbled on his article and think it is worth sharing.

I especially want to point to his report on apps and technologies that let users monitor and manage their health.

He writes in Innovation Excellence | Five Tech Trends Impacting Business Innovation in 2012:

“Novel apps and devices will increasingly let consumers discreetly manage their health more productively. Self analysis tools have just begun to trickle into the market with technology like Fitbit and JawboneUP. Research company Technavio predicts that the global mobile health applications market will reach USD 4.1 billion by 2014, up from USD 1.7 billion in 2010.

“You’ll see solutions for diagnosing, monitoring and treating a variety of illnesses – from obesity to asthma, from poor vision or hearing to high blood pressure. Seemingly disparate data points, work activity, commute, financial and calendar data will be compared to health behaviors to achieve new understanding of ones self. This data tracking will create new benefits for the individual. It will also intensify the data concerns and scrutiny if online and cloud services that support the system of personal data storage.

“Need further proof? Apple’s App Store currently offers 9,000 mobile health apps (1,500 cardio apps, 1,300 diet apps, 1,000 stress and relaxation apps, and 650 women’s health apps). By mid-2012, this number is expected reach 13,000 (Source: MobiHealthNews, September 2011).

“Collecting, sharing, tracking and optimization of oneself is a major trend for 2012. Look for this trend to extend into other sectors throughout the year.”

The article goes on to list some of the gadgets out there in more detail.

I consider this a very interesting development that is worth following. Are you seeing similar trends in technology use for health care?