The Discovery Lab: the pearls and perils

A recent European study on University-Business Cooperation (UBC) suggested that joint research can have major benefits. Among those are “innovation with a longer-term horizon, shorter-term problem solving, data for high quality research and the ability to bring research into practice creating impact”. Max Welling explored the necessity of the role of corporates in AI ecosystems in his blog. Local innovation ecosystems – such as Amsterdam Data Science (ADS) – can act as catalysts for regional competitiveness. Campuses can act as “platforms or hubs” – as with the Innovation Centre for AI (ICAI) Labs, and these in turn drive societal and economic impact for governments. These hubs are often the result of collaboration between academia and commercial businesses. The benefits are great, but getting such collaborations off the ground can be tricky – so what are the secrets to a fruitful collaboration? The Discovery Lab, established a year ago, is a collaboration between Elsevier, the University of Amsterdam and the VU Amsterdam, with support from ADS. The lab is the latest step in a collaboration that started about 10 years ago, inspired by a joint vision of using AI and knowledge graphs to make research easier.

Trust is the Key Element to any Successful Partnership

The creation of the Discovery Lab was an exercise in trust, not just building trust between collaborators but also using that trust to develop an idea into something tangible. Although it may sound trivial, personal trust is essential, starting with the initial “click” that gets the partners excited about an idea. Collaboration between organisations is ultimately a collaboration between individuals.

Success Factors to Launch the Discovery Lab

Looking back on our experiences with ADS and ICAI, three success factors were dominant:

1. Building credibility and showing reliability

Getting to know each other’s capabilities over time helps clarify which partners are well suited to which projects. Before either side made big investments, we worked together through joint funding applications, master student projects and smaller contract research work. Although this takes longer, it builds a strong foundation. Together with ADS, we helped shape an ecosystem around this work that allowed all partners to explore ideas with each other and build joint excitement – before making bigger investments. Within the Discovery Lab we consolidated and connected various joint projects around research knowledge graphs.

2. Legal is necessary, but should not get in the way

The most frequent complaint I hear across collaboration set-ups is about legal matters delaying and complicating a partnership. Involving legal early in the process – as people often suggest – can actually complicate the partnership at the wrong time. If they’re involved too late, it can delay the process at the end. So, how do you find the right balance, and how do you actually benefit from the legal queries raised in building trust? In my experience, what holds the process back is that business priorities are not detailed and clear enough to state what is key to the business or the research and what is optional. For example, on the business side: in IP, what risk is the business willing to take, and is the business actually able to absorb the results of the research? On the university side: what commitments, e.g. in terms of liabilities or in terms of AI ethics, can a university and a researcher make? And there are differences in style, e.g. what level of detail needs to be agreed upon and what can be left to later negotiations. The challenge is how to formalize working with uncertainty and whether you’re open to defining the approach, the process, the principles of work and the resources rather than the outcomes. The more obscure the outcomes and processes are, the more likely it is that legal issues appear.

3. Challenges with an evolving research landscape

The challenges to establishing the Discovery Lab were sometimes unexpected; the main issue we faced was too little transparency in government decision criteria. The key to overcoming these challenges was to always respond quickly and flexibly in collaboration with our partners. Different methods of doing so may need to be tested, but a strong foundation of trust between our collaborators meant we were able to face these challenges head on.

The Discovery Lab one year on

The Discovery Lab has been active for a year, and our joint efforts have really paid off. Despite the difficult circumstances caused by COVID-19, there is active collaboration between the university researchers and Elsevier’s data scientists in the Lab. The first joint experiments are being conducted, the first publications are already out, and we’re actively connecting research in the Lab with business activities. We hope that our experiences in building the Discovery Lab will help others to build similarly successful collaborations within the ADS ecosystem.

Europe needs its own Google!

The pros and cons of the existing Big Tech elite

The US and Chinese Big Tech elite deliver us many benefits. They have been not only a driver of innovation but also a creator of thousands of jobs. The technology that has come out of these companies has enabled us to stay in touch with our family and friends, access all kinds of information or media and even automate large parts of our lives (think online shopping and payments). These services are often offered to us under the guise of being “free”, but are they really free? No, of course not. How could any company employing hundreds of thousands of people survive by delivering free services? For many of these services the hidden cost is our data. With every click, swipe and scroll, information is collected, which often leads to adverts and other suggestions from the platform. But what is wrong with this? The following three arguments stand out in my opinion.

1. Consumers ultimately lose

Consumers pay too much for goods and services. To demonstrate how, imagine a relatively small retailer R uses Big Tech platform P to sell its goods. If P detects that R is doing well, it could start to compete with R by selling similar goods or services at a lower price. This could even be below cost price, because P has “deep pockets”, much deeper than R. At some point it becomes unprofitable for R to sell via P and R leaves the platform. Then, P takes over and increases its prices. This is called “predatory pricing”. Lina Khan wrote a well-cited article about predatory pricing in relation to Amazon. Khan explains that in most cases, predatory pricing is forbidden, because in the long run consumers pay too much, due to lack of competition in the market.

Another example of the consumer ultimately losing is the manipulation of search results. In 2017, the European Commission fined Google €2.4 billion for placing its own shopping comparison service “Google Shopping” in a more prominent position in its search results than its competitors. This gives Google an unfair advantage in what consumers see. The opportunity arises for Google because of “vertical integration”: both “Google Search” and the platform “Google Shopping” are owned and controlled by one company. The European Commission argued that consumers are harmed by this use of vertical integration and fined Google.

2. The invasion of our privacy

We assume our data is private, but when you look at the terms and conditions of the Big Tech platforms, this is often not the case. Our data is sold to third parties or used by a different division of the Big Tech company, and we have no control over how it is used. For example, your fitness tracker data could be shared with the financial services division of the same company. This could lead to you being denied access to certain health insurance, without you ever being aware that your data was used for this purpose. There is also the infamous case of Cambridge Analytica, where private Facebook data was used for political gain.

Another privacy threat is not so much the fault of the US or Chinese Big Tech elite, but rather of the national security laws in their home countries. These enable governments to obtain access to all the data a company stores and processes. Governments could use these laws for all kinds of good purposes, e.g. to prevent terrorism, but also for bad purposes, e.g. to restrict “freedom of speech” or protests, or for (corporate) espionage. An example is the US Patriot Act, while a similar law applies in China.

3. The competitive position of Europe

Big Tech’s dominance hurts the European (knowledge) economy and production. Many countries in Europe want a knowledge- and technology-based economy, but find their top talent is recruited by US and – to a lesser but growing extent – Chinese Big Tech companies. This is something we experience in the Amsterdam region. By drawing talent away, the Big Tech elite further extends its technological superiority and erodes Europe’s power to innovate. In the end, this is very costly to European production, GDP and the labour market. Furthermore, our economy becomes increasingly dependent on the US and China.

How can we protect ourselves?

There are many ways to fight against the three issues above. On the consumer level, this includes better education about the dangers of “free” services, and naming and shaming offending companies. On a regulatory level, we can design better laws, improve law enforcement, and fine or split up companies, for instance by separating the infrastructure (e.g. the Google browser) from the other services (e.g. Google Search). We could even separate the services from each other (separate Google Search from Google Shopping).

However, these actions are primarily defensive and I have reservations about whether they alone can deal with these issues satisfactorily. Furthermore, many of these actions will be very difficult and time-consuming to implement, while there are also strong doubts about their applicability and effectiveness. Some of them might even have the opposite effect and decrease competition. For these reasons, I am a strong supporter of combining these defensive actions with a more offensive approach: building our own European alternatives. GAIA-X might be a first step for Cloud Storage, but we also need other applications, such as our own AI apps, similar to the suites that Amazon, Microsoft and Google offer.

The big question is of course: how do we create this from scratch? We will face obvious challenges. Who should invest in such a project, and how much, during this difficult economic period in Europe? Do we have the right know-how to develop high quality products and services, and will we really be able to compete with the current Big Tech elite? Although many doubt our ability, there is precedent for such collaboration. Airbus is a prime example of a successful project with multiple European countries and companies involved. Time is ticking as we deliberate over these challenges. We need our own Google, before it is too late!

AI Research with China: to Collaborate or not to Collaborate – is that the Question?

The opinions in this blog item are the author’s own and do not necessarily reflect those of the organisations she represents.

Amsterdam has a historically strong connection with Chinese culture, housing one of the oldest Chinatowns in the Netherlands. While our perception of Chinese culture is perhaps based predominantly on its cuisine, we have to reassess any biases of the past and understand the dynamic, creative and innovative world of AI in China today.

Shifting our perspective of China

Our image of China in Europe comes predominantly through the eyes of Western media [1]. This mixes images of peasant farmers working with technologies of the past, vast modern cities with millions of citizens, and, more recently, the threat of 5G technology being used to infiltrate our state security systems. This results in a biased perception when we are confronted with issues in our own field. China takes a long-term view and this can be seen in its investments in AI research innovation, and particularly its tech talent. Huge efforts have been made to attract successful AI researchers back to their home country to carry out internationally competitive research and to educate new generations of talent. Furthermore, China’s presence in the international AI research community is growing. This can be seen by the increasing percentage of papers in the top international AI conferences that are co-authored by Chinese colleagues, working from China or from abroad [2].

Cultural differences in AI applications

While having more, and better, researchers across the globe is generally good news for academic research, in AI we need to remain cautious. China’s enormous investments in AI have led to domination in a narrow set of sub-fields around machine learning, with an emphasis on computer vision and language recognition. This domination could be perceived as cause for concern from an international standpoint. For example, computer vision techniques can be developed for facial recognition to track the movements of citizens, and different cultures perceive the benefits and dangers of this differently. Using these same techniques for other applications, such as recognising the differences between cancerous and benign cells, is, however, universally perceived as “good”.

To collaborate or not to collaborate?

This brings us to the difficult political and scientific choices that need to be made as to when and how to collaborate with China, and when to politely decline. Do we need to completely halt all collaboration with Chinese academics and companies? In doing so, we would isolate our colleagues in China. Furthermore, cessation of collaboration would be counter to the established international research culture of openness and dialogue. It is common for European researchers to collaborate with large corporations based in the US. They fund research collaborations and attract high-profile staff to work with them. At the same time, they have created the data economy that led to the passing of EU law to give European citizens at least some control of the data that they (often unknowingly) give to these corporations. There is little discussion in academia, at least to my knowledge, as to whether we should think carefully about collaborations with these US-based companies. While it would be nice to have concrete national guidelines, for example those developed by Frank Bekkers and colleagues [3], or have every AI academic take a course in ethics before signing a contract with a large corporation, this is unrealistic. That said, when working with any large corporation, be they US or China-based, it is essential to retain academic freedom to choose with whom we work and on what research topics.

We cannot not collaborate with the Chinese

So what are my recommendations in this complex and sometimes contradictory collaboration puzzle? China is a world-leader in AI research, technology and innovation. As investment into this field continues to grow, this will only become more pertinent. We therefore cannot ignore the relevance of China in our own research and development, but we can be considered in our approach to collaborations and make informed decisions on a case-by-case basis.

Within the Amsterdam Data Science academic network we have a number of connections with Chinese universities and research institutions, such as the Chinese Academy of Sciences Institute of Automation, Tsinghua University and the Wuhan University of Science & Technology. Alongside my role as director of Amsterdam Data Science, I am the European Director of LIAMA, an organisation for stimulating research collaboration in maths and computer science between CWI, Inria and the Chinese Academy of Sciences. Our goal – just as in any international research collaboration – is to stimulate creative and innovative research through the mix of local research cultures.

If you would like to collaborate with a Chinese research lab or company, then reach out. Make friends with a Chinese colleague and learn about their culture. Watch some of the Ruben Terlou documentaries. Read the “AI Superpowers” book by Kai-Fu Lee, which gives insights into taking the Silicon Valley start-up culture and transferring it to China, while at the same time metamorphosing it to the rules of a new “Wild East”. Learn Chinese and (when we can all travel again) visit your colleagues in China. In the 17th century, Amsterdam was one of the few harbours of religious freedom in the world. Let us continue this tradition by welcoming researchers from other cultures and, through collaboration, understand more about the cultures they come from.

References
  1. A refreshing change, for those who understand Dutch, are the VPRO series about China by Ruben Terlou.
  2. Elsevier, 2018. “Artificial Intelligence: How knowledge is created, transferred, and used. Trends in China, Europe, and the United States”.
  3. Frank Bekkers, Willem Oosterveld, Paul Verhagen, “Checklist for Collaboration with Chinese Universities and Other Research Institutions”, The Hague Centre for Strategic Studies, January & September 2019.

Collaborative data analysis using SWISH DataLab

SWISH unites SWI-Prolog and R behind a web based IDE that resembles Jupyter notebooks. The platform allows multiple data scientists to work on the same data simultaneously, while rule sets can be reused and shared between users. This makes it easier for data scientists to provide more complex data transformation steps to domain experts.

Most pipelines use a general purpose programming language such as Python to clean and ingest the data into a linked data store or RDBMS. The relevant data is then selected and appropriate machine learning is applied. In contrast, SWISH data management is based on Prolog, a relational and logic based language. External data sources, such as RDBMS, Linked Data, CSV files, XML files and JSON, are made available using a mixture of adaptors, which make the data available in Prolog’s relational model without transferring the data, and ingestion, which loads the data into Prolog. Being able to use the data in a unified framework without transferring it makes it simpler to bring the data together.

Subsequently, declarative rules define a clean and coherent view on the data that is targeted towards analysing this data. Given the logic basis of Prolog, this view is modular, concise and declarative, making it easy to maintain. SWI-Prolog’s tabling extension provides the same termination properties as Datalog, as well as the same order independence of rules within the subset Prolog shares with Datalog. Tabling also provides caching of results. At the same time, users have access to the more general Prolog language to code transformations that are not supported by Datalog.

According to Wikipedia, “In recent years, Datalog has found new application in data integration, information extraction, …”. SWISH adds collaboration, as well as Turing completeness to deal with transformations that Datalog is not capable of, in a coherent environment.
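To make the termination and order-independence claim a little more concrete, here is a minimal sketch in Python. It is not SWISH or SWI-Prolog code, and it does not show how tabling is implemented; it only imitates, with a simple bottom-up fixed-point loop, the behaviour Datalog guarantees for a recursive rule set such as `path(X, Y) :- edge(X, Y).` and `path(X, Z) :- path(X, Y), edge(Y, Z).`. The names `reachable` and `edges` are made up for this illustration.

```python
from typing import Iterable, Set, Tuple

Edge = Tuple[str, str]


def reachable(edges: Iterable[Edge]) -> Set[Edge]:
    """Compute the least fixed point of the two Datalog-style rules:

        path(X, Y) :- edge(X, Y).
        path(X, Z) :- path(X, Y), edge(Y, Z).

    The loop stops once a round derives no new facts, so it terminates
    even when the graph contains cycles.
    """
    edge_set = set(edges)
    path = set(edge_set)   # all facts derived so far
    delta = set(edge_set)  # facts that were new in the previous round
    while delta:
        # Extend only the newly found paths by one more edge.
        new = {
            (x, z)
            for (x, y) in delta
            for (y2, z) in edge_set
            if y == y2
        } - path
        path |= new
        delta = new
    return path


# A cyclic graph: a -> b -> c -> a (hypothetical data for illustration).
edges = [("a", "b"), ("b", "c"), ("c", "a")]
closure = reachable(edges)

# Supplying the facts in a different order yields the same set of answers.
same_result = reachable(list(reversed(edges))) == closure
```

A naive depth-first evaluation of the same rules could loop forever on the cycle; the fixed-point view avoids that because already-derived facts are never derived again, which is also the intuition behind the caching of results mentioned above.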
  • The SWISH DataLab can be configured to allow both authenticated users and anonymous users with limited access rights. 
  • Notebooks and programs are stored in a GIT-like repository and fully versioned. 
  • Results can be reproduced reliably through creating a snapshot of a query and all relevant programs.
  • Data views defined in SWISH may be downloaded as CSV and can be accessed through a web based API.
  • The platform can be deployed on your laptop as well as on a server. 
The SWISH DataLab provides a high-level platform to select and combine data sources in multiple workflows, while using tools that are in common usage by data analysis professionals. Everything you need to get started with the SWISH DataLab is available as open source software.

AI Technology for People

Amsterdam Data Science (ADS) has created a tremendous network in which academia, companies, and the municipality come together in meetups and other events. Furthermore, its newsletter reports news from the community, and ADS seed projects encourage collaboration between research and companies. These are powerful means to build an ecosystem around data science, a field in which various disciplines have to come together to be successful and where fundamental research and application in real world settings go hand-in-hand.

AI affecting the world in which we live

Recently ADS has added AI to their agenda and rightly so – AI is transforming the world at a very rapid pace. It builds upon the foundations laid out in big data research and data science. It introduces intelligence in the form of being able to really understand unstructured data such as text, images, sound, and video, and techniques for giving machines a form of autonomous behaviour. Over the last months the knowledge institutes in Amsterdam (AUMC, CWI, HvA, NKI, UvA, VU), in conjunction with Sanquin, the Amsterdam Economic Board and the City of Amsterdam, have formed a coalition and developed a joint proposition for AI in Amsterdam: “AI Technology for People” building upon three foundational themes.
  • Machine learning has been the main driver in the emergence of AI – and will continue to push it forward. Techniques include data-driven deep learning methods for computer vision, text analysis and search approaches that make large datasets accessible, and knowledge representation and reasoning techniques to work with more human-interpretable symbolic information. Related activities include the analysis of complex organizational processes.
  • Responsible AI is key to assuring that technology is fair, accountable and transparent (FAT). Methods need to prevent undesirable bias and all outcomes should be explainable through the identification of comprehensible parameters on which decisions are based. When high-impact decisions are involved, the reasoning behind them must be understandable to allow for ethical considerations and professional judgements.
  • Hybrid intelligence combines the best of both of these worlds. It builds on the superiority of AI technology in many pattern recognition and machine learning tasks and combines it with the strengths of humans to deploy general knowledge, common sense reasoning and human capabilities such as collaboration, adaptivity, responsibility and explainability. The aim is to combine human and machine intelligence to expand on human intellect rather than replace it. See the recent blog by Frank van Harmelen on the Hybrid Intelligence project.

The focus of AI Technology for People

The coalition focuses on three application domains.
  • AI for business innovation: As described in Max Welling’s blog, research excellence has already inspired several international partners to start research labs in Amsterdam within the Innovation Center for Artificial Intelligence (ICAI). Other companies, both regional and (inter)national, continue to follow suit. Amsterdam hosts the headquarters of major companies that rely on AI to innovate, many small- and medium-sized high-tech AI businesses and a strong creative industry. The city provides an ideal ecosystem in which business innovations – both small and large – can flourish.
  • AI for citizens: With its multitude of cultures, large numbers of tourists, rich history, criminal element and intense housing market, Amsterdam has all the challenges and opportunities of other major world cities, but in a far smaller area. With the excellent availability of open data in the city, AI can be applied directly to improve the well-being of citizens – with the city itself becoming a living lab.
  • AI for health: The coalition is building on the work of renowned medical research organisations such as Amsterdam UMC, NKI, Sanquin and the Netherlands Institute for Neuroscience. The cross-sectoral health-AI collaboration has also been institutionalized in other ways, such as through ecosystem mapping and Amsterdam Medical Data Science meet-ups, with all initiatives being bundled under Smart Health Amsterdam.

Achieving its Goals

To realize the above ambitions, the AI coalition partners not only plan to make their own major investments in AI, they also aim to attract significant external funding, for example through labs within the Innovation Center for Artificial Intelligence (ICAI) and other funding instruments through the National AI Coalition (NL AIC). Ecosystems in which science, policy, industry, and society (the quadruple helix) come together are the basis for these national initiatives. Amsterdam and the ADS ecosystem provide a successful regional example of collaborations in many forms, such as industry funded PhD students, joint appointments, professionally oriented education and partner meetings. ICAI, which has its headquarters in Amsterdam and labs all over the Netherlands, is a member of ADS and creates industry funded research labs that produce research presented at top international academic conferences. Let’s use the momentum to bring the ecosystem to the next level.