What to bin and what to keep: the big data conundrum

Figuring out which data is valuable and binning the rest has been a challenge for the telco industry, but here’s an interesting dilemma: how do you know the value data will have for the use cases of tomorrow?

This was broadly one of the topics of conversation at Light Reading’s Software Defined Operations & the Autonomous Network event in London. Everyone in the industry knows that data is going to be a big thing, but the influx of questions is almost as overwhelming as the number of data sets available.

“90% of the data we collect is useless 90% of the time,” said Tom Griffin of SevOne.

This opens the floodgates to more questions. Why do you want to collect certain data sets? How frequently are you going to collect the data? Where is it going to be stored? What are the regulatory requirements? How in-depth does the data need to be for the desired use? What do you do with the redundant data? Will it still be redundant in the future? What is the consequence of binning data which might become valuable in the future? How long do you keep information in the hope that it will one day become useful?

For all the promise of data analytics and artificial intelligence in the industry, the telcos have barely left the starting blocks. For Griffin and John Clowers of Cisco, identifying the specific use case is key. This might sound obvious, but it’s amazing how many people are still floundering. Once the use case has been identified, machine learning and artificial intelligence become critically important.

As Clowers pointed out, with ML and AI, data can be analysed in near real-time as it is collected, assigned to the right storage environment (public, private or traditional, depending on regulatory requirements) and then routed to the right data lakes or ponds (depending on the purpose for collecting the data in the first place). With the right algorithms in place, the process of classifying and storing information can be automated, freeing up engineers’ time to add value while also keeping an eye on costs. With the sheer volume of information being collected increasing very quickly, storage costs could rise rapidly.
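
To make the classify-and-route idea concrete, here is a minimal sketch in Python. It is not Cisco’s or SevOne’s implementation: the record fields, the tier names and the rules are all hypothetical, and simple hand-written rules stand in for the trained ML classifier Clowers describes.

```python
from dataclasses import dataclass

# Hypothetical storage tiers, mirroring the "public, private or traditional" split.
PUBLIC = "public"
PRIVATE = "private"
TRADITIONAL = "traditional"


@dataclass
class Record:
    source: str                      # where the data came from (illustrative)
    purpose: str                     # why it was collected, e.g. "billing"
    regulated: bool                  # assumed flag: subject to regulatory constraints
    must_stay_on_prem: bool = False  # assumed flag: may not leave owned infrastructure


def choose_storage(record: Record) -> str:
    """Pick a storage environment from the (assumed) regulatory flags."""
    if record.must_stay_on_prem:
        return TRADITIONAL
    if record.regulated:
        return PRIVATE
    return PUBLIC


def choose_lake(record: Record) -> str:
    """Pick a data lake based on the purpose the data was collected for."""
    return f"{record.purpose}-lake"


def route(record: Record) -> tuple[str, str]:
    """Return (storage environment, data lake) for one incoming record."""
    return choose_storage(record), choose_lake(record)


if __name__ == "__main__":
    for rec in (
        Record("probe-1", "network-assurance", regulated=False),
        Record("crm", "billing", regulated=True),
    ):
        print(rec.source, route(rec))
```

In a real deployment the rules in choose_storage would be replaced by whatever classifier and regulatory policy the operator actually uses; the point is only that the decision can be made per record, as the data arrives, without an engineer in the loop.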

And this is before the 5G and IoT trends have really kicked in. If telcos are struggling with the data demands of today, how are they going to cope with the tsunami of information which is almost guaranteed in tomorrow’s digital economy?

Which brings us back to the original point. If you have to be selective about the information you keep, how do you know what information will be valuable for the use cases of tomorrow? And what will be the cost of not having this data?