Exploring the Future of Data and AI
Our CTO, Sabri Skhiri, recently travelled to Sorrento for IEEE Big Data 2023, a notable yearly gathering where experts delve into the intricacies of data science and AI. In this article, Sabri walks you through the keynotes and talks from the conference, highlighting the noteworthy insights and practical applications shared by industry leaders:
"It's always a delight to participate in this conference; discussions about data science and scalability extend from breakfast through dinner! I had engaging conversations on various subjects, such as graph mining, knowledge graphs, autoML, stream processing, privacy, drug discovery, and fraud detection.
Before I dive in, I want to note that this article won’t delve into technical specifics. However, feel free to consult my detailed blog post on our research website, where I go deeper into the keynote speeches, tutorials, and tech discoveries.
Trends and Innovations in AI
Following last year's edition in Osaka (the first outside the US), the conference continues to broaden its international reach, as reflected in its diverse Program Committee and the wide range of papers presented.
It spotlighted key areas in AI: deep learning, machine learning, and their applications in sectors such as healthcare, mobility, and industry. Sustainability also emerged as a major theme, with discussions on frugal AI and green IT.
Of particular interest was the discussion on robust AI models. For businesses, this translates into developing AI systems that can reliably handle real-world data complexities, like noisy labels (training examples that are annotated incorrectly) and concept drift (data whose patterns change over time).
The growing focus on ethical AI, including fairness and uncertainty estimation, reflects a broader industry move towards responsible AI practices, a crucial aspect for maintaining public trust and regulatory compliance. At Euranova, we actively engage in similar discussions, notably through our sustainable AI and responsible AI research track. For those interested in delving deeper into the topic of responsible AI, I recommend our recent discussion with UAntwerpen expert Toon Calders, hosted at Vlerick Business School.
Privacy in the Data Age
The conference addressed the difficulty of maintaining privacy in the face of increasing data breaches, a crucial focus for businesses that must leverage data for growth while safeguarding user privacy.
For example, a keynote on data sharing in healthcare emphasised privacy challenges in the rapidly growing healthcare data sector. Understanding and legally sharing private healthcare data remains complex. It requires federating data across hospitals, especially for rare diseases, while adhering to legal and ethical guidelines. The talk critiqued traditional anonymisation methods, such as removing personally identifiable information (PII), which are proving insufficient, as seen in incidents like the re-identification of the Massachusetts Governor's medical records.
Such incidents demonstrate the vulnerabilities in current data protection practices. For businesses, understanding and implementing privacy-enhancing technologies (PETs) like Federated Learning and Secure Multi-Party Computation is no longer optional but essential. Synthetic data, which preserves the statistical properties of the original data, also offers businesses a way to utilise data insights while mitigating privacy risks. These technologies enable collaborative data usage while safeguarding sensitive information, a vital capability in today's data-driven world. Should you want to explore it, I delve deeper into this subject in my article on PETs.
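For readers who like to see the idea in code, here is a minimal sketch of one building block often used in Secure Multi-Party Computation: additive secret sharing. It is not taken from the keynote. The hospital counts, the function names and the modulus are all assumptions chosen for illustration, and a real deployment would add authenticated communication channels and protection against dishonest parties.

```python
import secrets

# Illustrative modulus (an assumption): all arithmetic is done modulo this value.
MODULUS = 2**61 - 1


def make_shares(value, n_parties):
    """Split a non-negative integer into n random shares that sum to it modulo MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(n_parties - 1)]
    last_share = (value - sum(shares)) % MODULUS
    return shares + [last_share]


def secure_sum(private_values):
    """Compute the total of the private values using only shares.

    Each party splits its value into one share per party. Party j receives
    the j-th share from everyone and publishes only the sum of those shares.
    An individual share is a uniformly random number, so on its own it
    reveals nothing about the value it came from.
    """
    n = len(private_values)
    all_shares = [make_shares(v, n) for v in private_values]
    partial_sums = [
        sum(all_shares[i][j] for i in range(n)) % MODULUS for j in range(n)
    ]
    return sum(partial_sums) % MODULUS


# Hypothetical example: three hospitals count patients with a rare disease.
hospital_counts = [12, 7, 30]
print(secure_sum(hospital_counts))
```

The point of the sketch is that the hospitals learn the combined count without any of them disclosing its own figure, which is the kind of collaborative data usage described above.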
Case Studies in Application
An interesting case study from Yahoo demonstrated innovative audience prospecting methods for native advertising, highlighting the practical application of AI and big data in marketing.
While effective, traditional approaches like retargeting, search-prospecting, and location-prospecting often have limitations in reaching new potential customers. In today's rapidly evolving digital marketplace, businesses are constantly seeking innovative methods to reach a wider audience and engage them more effectively. Yahoo's presentation introduced two methods for Dynamic-Product-Ads (DPA) in native advertising within the Yahoo Gemini marketplace, addressing the challenge of expanding the audience for DPAs.
The first method, Conversion-Prospecting, uses logged data to predict DPA conversion rates and calculate the expected cost-per-action (CPA), optimising DPA bids in auctions to balance audience expansion with performance goals.
The second method, Trending-Prospecting, matches trending products with users based on their engagement on advertisers' sites, aiding in identifying popular products and user preferences, which is especially beneficial for new advertisers.
These methods have significantly improved DPA delivery and revenue while keeping CPA within target ranges and are now actively used in the Gemini marketplace. Should you want to dive into it, the paper offers a detailed overview of the Yahoo Gemini DPA system and introduces these effective prospecting strategies.
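To make the relationship between a predicted conversion rate, the expected CPA and a bid more concrete, here is a small sketch. It is not Yahoo's implementation: the talk summary above does not specify the prediction model or the bidding rule. The sketch assumes per-click pricing, in which case the expected CPA is simply the cost per click divided by the predicted conversion rate. The function names and the numbers are hypothetical.

```python
def expected_cpa(cost_per_click, predicted_cvr):
    """Expected cost-per-action under per-click pricing (an assumption).

    predicted_cvr is the predicted probability that a click leads to a conversion.
    """
    if predicted_cvr <= 0:
        return float("inf")  # no expected conversions: the cost per action is unbounded
    return cost_per_click / predicted_cvr


def capped_bid(base_bid, predicted_cvr, target_cpa):
    """Highest per-click bid, never above base_bid, that keeps expected CPA within target."""
    return min(base_bid, target_cpa * predicted_cvr)


# Hypothetical numbers: a 2% predicted conversion rate and a target CPA of 20.
bid = capped_bid(base_bid=0.50, predicted_cvr=0.02, target_cpa=20.0)
print(bid, expected_cpa(bid, 0.02))
```

With these made-up numbers the bid is reduced below the base bid so that the expected CPA does not exceed the target, which illustrates the balance between audience expansion and performance goals mentioned above.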
Conclusion: A Fusion of Knowledge and Innovation
The conference's quality, particularly in data science, has been impressive: IEEE Big Data remains a premier venue. Coupled with the incredible Italian cuisine, you can easily see why this was one of my favourite experiences of the year!
The conference showcased the transformative power of AI and big data across various sectors. From marketing to healthcare, the applications are vast and varied, offering businesses a window into how they can harness these technologies for their own growth and innovation. It also highlighted the need for robust, ethical AI, while addressing critical issues like data privacy. As we step into an era where data is omnipresent, we can pave the way for groundbreaking innovations and responsible AI development."