Case Study: Schiphol Airport CO2 Emission of Flights

Authors: A. Singh & J.P.R. van der Laarse
Publication Date: December 4, 2021

For our final case study, we focused on predicting CO2 emissions from flights taking off and landing at Schiphol Airport, using AI. 

Why this is an interesting case 

Schiphol has promised to be CO2 neutral by 2030. This promise, however, only covers the airport's own operations (think, for example, of the cars that drive along the taxiways) and not the actual flights.

Parallel to this promise, we see an annual increase in flight movements that would largely cancel out the reduced CO2 emissions of the airport.

Inverse Surveillance AI 

If we take a look at our definition of Inverse Surveillance AI, the case study should satisfy the following rules:

(1) Citizens are surveilling governments and bigger organizations (2) in order to control and influence and (3) thus promote transparency and equality, and by doing so democratizing power.

In this case, the AI predicting CO2 emissions of flights will be used by citizens to surveil a bigger organization, Schiphol Group, in the context of their promise to reduce emissions. 

Flight Data 

For this case study, the data we need to surveil this larger organization, namely flight data, is freely available. We can use the following sources (a minimal request sketch follows after the links):

Via the developer portal of Schiphol Airport, we can use their API and request all sorts of data, especially flight data: https://developer.schiphol.nl/login

We can also make use of this Schiphol Airport Flight Data API Python Wrapper on GitHub.
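
As a minimal sketch, requesting flight data could look like the snippet below. The endpoint URL and header names follow the portal's public-flights API as we understand it; treat them as assumptions and verify them in the documentation after registering. The credentials are placeholders.

```python
import requests

# Placeholder credentials, obtained by registering on the developer portal.
APP_ID = "your_app_id"
APP_KEY = "your_app_key"

def fetch_flights(page=0):
    """Request one page of flight data from the Schiphol public-flights API."""
    response = requests.get(
        "https://api.schiphol.nl/public-flights/flights",
        headers={
            "Accept": "application/json",
            "ResourceVersion": "v4",  # assumed resource version; check the docs
            "app_id": APP_ID,
            "app_key": APP_KEY,
        },
        params={"page": page},
    )
    response.raise_for_status()
    return response.json()

# Field names are assumptions based on the API's flight objects.
for flight in fetch_flights().get("flights", []):
    print(flight.get("flightName"), flight.get("route", {}).get("destinations"))
```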

CO2 emission of flights

We can use the following links for making CO2 emission assumptions and calculations (a back-of-the-envelope sketch follows after the links):

CO2 Emission of commercial aviation (per airplane type) https://theicct.org/sites/default/files/publications/CO2-commercial-aviation-oct2020.pdf 

CO2 Emission of aviation calculations by Carbonindependent
https://www.carbonindependent.org/22.html 

CO2 footprint calculation based on the distance between airports
https://github.com/acircleda/footprint 
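
To make the distance-based approach concrete, here is a back-of-the-envelope sketch: the great-circle distance between two airports multiplied by an assumed per-passenger emission factor. The factor below is a placeholder for illustration, not an official figure; the ICCT and Carbonindependent sources above give grounded numbers per aircraft type.

```python
from math import asin, cos, radians, sin, sqrt

# Placeholder emission factor (kg CO2 per passenger-kilometre); real values
# depend on aircraft type, load factor, and flight phase -- see sources above.
KG_CO2_PER_PAX_KM = 0.115

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two coordinates, in kilometres."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    return 2 * 6371 * asin(sqrt(a))

# Schiphol (AMS) to New York (JFK)
distance = haversine_km(52.3105, 4.7683, 40.6413, -73.7781)
print(f"{distance:.0f} km, roughly {distance * KG_CO2_PER_PAX_KM:.0f} kg CO2 per passenger")
```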

Github links – CO2 Prediction

A base to start with (a minimal regression sketch follows after these links):
https://github.com/deoudit/Predicting-CO2-Emissions-using-Multiple-Linear-Regression

Different type of regressions:
https://github.com/pranavtumkur/Predicting-CO2-emission-using-ML-Regression-models

https://github.com/Strifee/co2_predict

https://github.com/tannyamishra/CO2-emission-prediction-in-cars-regression

Overview of time series forecasting
https://www.kaggle.com/vijaikm/co2-emission-forecast-with-python-seasonal-arima

https://machinelearningmastery.com/time-series-prediction-lstm-recurrent-neural-networks-python-keras/
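
In the spirit of the multiple-linear-regression repositories above, a minimal sketch could look like the following. All numbers are made up for illustration; in the case study, the features would come from the Schiphol API combined with per-aircraft-type figures from the ICCT report.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Toy feature matrix: [flight distance (km), seat count]; made-up labels (kg CO2).
X = np.array([[500, 150], [1200, 180], [5800, 300], [9000, 350], [800, 150]])
y = np.array([9000, 22000, 160000, 270000, 14000])

model = LinearRegression().fit(X, y)
predicted = model.predict(np.array([[2000, 200]]))[0]
print(f"Predicted emissions for a 2000 km flight with 200 seats: {predicted:.0f} kg CO2")
```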

Panopticon Effect 

Now that citizens can predict CO2 emissions using AI, and thus surveil Schiphol Group, this surveillance only works if we also include the panopticon effect. For this, we need to publicly share the output of this AI, so that citizens, activists, and journalists can freely view and export the information and use it against Schiphol Group in their protests and in (social) media. By doing so, we can hold Schiphol Group accountable, poke holes in their emission-free story by showing that the real problem is the actual flights, and substantiate (political) arguments against Schiphol Group. Citizens will thus be empowered to stand up against the Schiphol Group and shift the power dynamic in their favor.

International Use 

Similar cases can be developed for other airports. Note, however, that the panopticon effect currently relies on a free society in which protests are allowed and can influence politicians, and thus regulations.

Inverse Surveillance AI Expert Interviews

In collaboration with the podcast Future Based, we conducted four expert interviews on the topic of Inverse Surveillance.

Episode 1 – Aidan Lyon

“In this first podcast episode we go into conversation with Aidan Lyon. Aidan completed his PhD in 2010 at the Australian National University on the philosophical foundations of probability. His research focuses on psychedelics, meditation, uncertainty, wisdom, and collective decision making. Aidan is also an entrepreneur: he is CEO and co-founder of DelphiCloud, and often works as a freelance consultant on projects relating to his research areas. His new book, Psychedelic Experience, is a philosophical analysis of psychedelic experience, with the central thesis that psychedelic experiences are mind-revealing experiences and can also occur via meditation.” source: Future Based Podcast Page


Episode 2 – Steve Mann

“Steve Mann, an inventor and professor widely hailed as “the father of wearable computing”, expanded on the concept at the MIT Media Lab. He is a professor studying priveillance, i.e. the interplay between privacy and the veillances, such as surveillance (oversight) and sousveillance (undersight), as well as metaveillance (sensing sensors and sensing their capacity to sense). Steve has been described by the media as “the world’s first cyborg” for his invention of Mediated Reality (the predecessor of Augmented Reality), and he also invented HDR and panoramics, now implemented in most cameras, including the Apple iPhone. He is considered by many to be the inventor of the WearComp (wearable computer) and WearCam (EyeTap and reality mediator). Furthermore, Steve joined Blueberry as Co-Founder and CTO in 2020. He is currently the acting director of the EyeTap Personal Imaging (ePI) Lab at the University of Toronto. He is also the Chief Technical Advisor of VisionerTech.

Steve has written more than 200 publications, books, and patents, and his work and inventions have been shown at the Smithsonian Institute, National Museum of American History, The Science Museum (Wellcome Wing, opened with Her Majesty The Queen in June 2000), MoMA (New York), Stedelijk Museum (Amsterdam), Triennale di Milano, Austin Museum of Art, and San Francisco Art Institute.” source: Future Based Podcast Page


Episode 3 – Nadia Benaissa

“Nadia Benaissa is a human rights advocate and has worked as a data protection officer at a municipality. She is a humanitarian, writer, and policy advisor at Bits of Freedom. Bits of Freedom is an organization that stands up for two fundamental rights in your digital communication that are indispensable for your freedom: privacy and communication freedom. These rights have been built up over centuries in the offline world, and because they are incredibly important for your individual freedom, for a just society, and for a healthy functioning democracy, it is important to reflect on how online rights are being guaranteed. But how exactly are democracy, freedom, and privacy being ensured online? In this episode, we talk with Nadia about the commonalities between AI and law, and about learning from historical data to improve the future.” source: Future Based Podcast Page


Episode 4 – Rudy van Belkom

“Rudy van Belkom is a futures researcher at the Netherlands Study Centre for Technology Trends (STT). He recently published a book about ethics in the design process (‘AI no longer has a plug’) that offers developers, policymakers, philosophers, and basically anyone with an interest in AI tools for integrating ethics into the AI design process. The main question of his research is always: what future do we want? We need to ask ourselves what purpose we want to use technology for, rather than seeing it as a purpose in itself. How can we use technology to create a better world? And what exactly is a better world? Currently, Rudy is focusing on the impact of technology on the future of democracy. In addition, he developed an ethical design game for AI, inspired by the scrum process, that can be used to translate ethical issues into practice. The essence of the game is based on the position paper that he wrote together with the HU research group on AI, which was accepted for ECAI 2020: ‘An Agile Framework for Trustworthy AI’. Van Belkom also investigated the role of AI in the future of his own field.” source: Future Based Podcast Page


Inverse Surveillance AI Hackathon 2021

This hackathon is part of the Inverse Surveillance AI research project.

Hackathon Challenge: With your help we can demonstrate the potential of Inverse Surveillance AI: using AI to surveil governments and bigger organizations in order to identify and predict wrongful behavior or systematic flaws, and by doing so empower citizens.

Everyone is welcome to join. (Individuals & Teams)
This includes students, researchers, professionals, etc.

What to Expect

The hackathon consists of two parts:
1. One month of preparation time (starting October 15, 2021)
2. Hackathon Weekend (19-20-21 November, 2021)

Online, via Discord, English, CET (UTC+1h)

You are not required to join all hackathon events if you have other obligations, as long as you submit your code before the deadline.

Deliverables:
1. Concept for Inverse Surveillance AI
2. Proof of Concept of Inverse Surveillance AI
3. A (video) pitch explaining your Proof of Concept

You can download the full hackathon briefing via the link below. There you can find a full description, the challenge expectations, guiding questions, prizes, an elaborate timeline and schedule, etc.

Timeline & Schedule

  • Preparation Month – Friday 15 Oct. – Friday 19 Nov.
    You are allowed to prepare your concept and write code
  • Pre-Hackathon Week – Friday 12 Nov. – Friday 19 Nov.
    • Q&A Session: Friday 12 Nov., 18:00-19:00 CET (UTC+1h)
  • Hackathon Day 1 – Friday 19 Nov. (18:30 – 20:30)
  • Hackathon Day 2 – Saturday 20 Nov. (09:00-18:00)
  • Hackathon Day 3 – Sunday 21 Nov. (09:00-18:30)
    • 15:00 CET (UTC+1h) Submit code, and (video) pitch

Join and make a difference!

Your Proof of Concept, in combination with the theoretical research and expert interview podcasts will serve as a launchpad for future research and work into the topic of Inverse Surveillance AI.

Inverse Surveillance offers a new perspective on the dynamic between citizens and bigger organisations and governments. AI makes this dynamic feasible. Inverse Surveillance AI can empower citizens and turn them into auditors keeping bigger organisations and governments in check, democratizing technology in the process.

Your proof of concept has the power to demonstrate the potential of Inverse Surveillance AI and get this idea rolling.

Sign-Up & Questions

For sign-ups you can e-mail Juliette van der Laarse at juliette@asimovinstitute.org or contact her through LinkedIn

Two Examples for Inverse Surveillance

Authors: J.P.R. van der Laarse & N.L. Neuman
Publication Date: September 24, 2021

Here we provide some metaphors as examples to better illustrate Inverse Surveillance. These metaphors are a representation of how we see inverse surveillance in comparison to other forms of surveillance and sousveillance at this moment in time. Throughout this project we aim to continue to refine this concept, and more clearly describe the differences between the different forms of veillance. 

Defining Surveillance

We use the terms surveillance and sousveillance as stand-alone concepts in these metaphors, based on the consensus within academic research. But surveillance could also be seen as an umbrella term for all activities. And the same is true for the term sousveillance with respect to all surveillance activities carried out by citizens, including inverse surveillance. 

The definitions used in these metaphors are based on our framework for inverse surveillance research. Prof. Steve Mann, who coined the term sousveillance, uses a broad veillance framework that encompasses surveillance, sousveillance, inverse surveillance, and other veillance concepts. He made the case for using veillance as the umbrella term instead of surveillance, which carries different connotations.

1) Police Officer vs. Auditor

Inverse surveillance is by definition not anti-government in a dystopian sense, but pro-government from a utopian stance. Inverse surveillance provides citizens with leverage for holding a government accountable, which ought to be considered a positive effect in a functioning democratic society. For the panopticon effect to work, there needs to be some level of threat. However, citizens will not take on the role of a police officer, who issues fines for criminal behaviour and exercises power. Rather, citizens using inverse surveillance AI will essentially fulfill the role of an auditor. Auditors are also within their rights to assess, correct, and sometimes enforce norms under the threat of specific consequences. However, an auditor differs from a police officer in that auditors report, while also offering organizations an opportunity for improvement. An auditor can be seen as an additional means of control, checking that everything within an organisation is running as it should according to some normative framework. Despite the strict monitoring role of auditors, in which they directly hold organizations accountable for their behavior, independent auditors are frequently hired by organizations themselves to monitor their business and operations, ensuring everything is in order when a formal audit occurs. This dynamic, in which organizations reach out to auditors for help in auditing their systems and for ideas for improvement, is exactly the kind of relationship our Inverse Surveillance project aims to stimulate between citizens and governments or other large organizations.

2) School examination

This metaphor relates to the different forms of veillance, and aims to illustrate the differences.

Surveillance: A teacher walks around during an exam to check if students are cheating. This is a form of power from above.

Counter-Surveillance: A student sits behind a pillar during an exam in protest, or sets their table up so that the teacher cannot perceive them properly. Whether the student cheats or not is irrelevant. The focus is on evading surveillance by the teacher. 

Sousveillance: The teacher walks past the tables and a student addresses their behaviour. For example: “Sir/Madam, I keep seeing you walking past the tables of students of colour. This is a form of discrimination”. The teacher’s surveillance is being observed and reported by a student.

Inverse Surveillance: The teacher walks past the students taking their exam, without the students paying attention to it. Surveillance is part of this process, and the students are not necessarily concerned about it. However, the students have set up a student council to evaluate the teachers and the school system. Are they working fairly? What exactly is being surveilled? Have any processes crept in that lead to, for example, occurrences of racism? Or are there patterns that can be identified that indicate corruption?

Panopticon for the Masses

Authors: J.P.R. van der Laarse & N.L. Neuman
Publication Date: May 7, 2021

With security cameras in public places, police making their regular rounds in neighborhoods, proctors watching students during exams, and government organizations monitoring suspicious behavior online, surveillance is a part of our daily lives. Not only does such surveillance help spot and punish criminal behavior, it also has a psychological effect, and it is this effect that makes surveillance so effective. This is known as the panopticon effect, after the panopticon model first proposed by Jeremy Bentham in the 18th century.

What is the panopticon effect? 
In short, it means that when you know you can be watched, you will behave better. In a public place you are less likely to show bad behavior because you are aware that you can be watched. Thus, you correct your own behavior without the police or other enforcement agents having to intervene. It is this psychological self-policing mechanism (Foucault, 1977) that makes surveillance such a powerful tool.

Bentham first looked at the panopticon model in the context of prisons, and he articulated the dynamic and the requirements needed to make the model work. Within this structure, the panopticon takes place inside an annular building of cell blocks, with a watchtower positioned at the center of the building. Each person within a cell block (the subject) is sectioned off from the other ‘prisoners’ inside their cell block, leading to an individualization of the subjects. The officials within the tower (the observers) are invisible to the subjects, yet have total visibility of the subjects themselves, leading to an asymmetrical power relation. The end result of this surveillance structure is that the subjects develop a self-regulating mechanism out of the anxiety of being watched, and thus adhere to the institutional categories of evaluation and behave as is expected of them. As Foucault explained, “the major effect of the panopticon: to induce in the inmate a state of conscious and permanent visibility that assures the automatic functioning of power” (Foucault, 1977; Jezierski, 2006).

According to Foucault, the panopticon model is as fascinating as it is frightening, and it illustrates Foucault’s views on the unequal power dynamic between citizens and government in general in the best possible way.

A Panopticon for the masses
To achieve Bentham’s form of the panopticon model, architectural change is required: the well-known dome prisons are architecturally designed specifically for this purpose. Security cameras achieve the same effect. The subjects can be viewed undisturbed by the observers without being able to engage in dialogue with them. To make the panopticon a reality, either a lot of money is needed for architectural redesign, or enough money is needed to install means of large-scale observation, such as security cameras. Permission to build and install, as well as the financial resources, are often in the hands of governments and larger organizations.

The democratization of AI, however, can be a game-changer for this dynamic. A simple algorithm can be developed at relatively little cost and function as thousands of observers. Not only is this useful for governments in the analysis of big data, this same tool can now be used by citizens to create a panopticon effect. AI thus makes surveillance by citizens, Inverse Surveillance, possible.

We do not see inverse surveillance as a counter-action to surveillance by governments and other large organizations. We merely acknowledge that citizens can now, through AI, create a panopticon effect of their own and thereby take part in the activity of surveillance. This presents opportunities in democratizing surveillance AI that we think are worth exploring. Within this research, we recognize that the panopticon effect works and citizens too can successfully use it as a tool. 

References:

  1. Foucault, M. (1977). Discipline and Punish: The Birth of the Prison. Translated by Sheridan, A. New York: Pantheon Books.
  2. Jezierski, W. (2006). Monasterium panopticum. Frühmittelalterliche Studien, 40(1), 167-182.

On Utopian Thinking

Authors: J.P.R. van der Laarse & N.L. Neuman
Publication Date: April 23, 2021

Surveillance AI is not exactly considered a positive development in this day and age, with controversial stories like China’s mass surveillance headlining many news platforms (Andersen, 2020; BBC, 2021). These news items evoke a negative perception of AI and remind us of movies like I, Robot, Terminator, 2001: A Space Odyssey, and Minority Report. This technophobic and dystopian view of Surveillance AI is part of the reason why ethical AI is a growing academic field. The focus of these studies lies primarily in preventing and countering dystopian applications of technology. However, despite the fact that such studies from a dystopian perspective are very much needed, they mainly focus on limiting or governing these developments, and work from within existing structures and systems. This leaves little room for positive, innovative developments.

Utopian Thinking
In order to get to a future that opens up new possibilities with regard to Surveillance AI, instead of limiting them, we need a different approach to complement the dystopian work. Theory U teaches us that we need to be critical of our frame of mind, and preferably break out of our institutional bubble. This would enable innovation and allow the emerging future to take shape more quickly (Scharmer & Senge, 2016). We thus need a more out-of-the-box approach that is not limited by existing institutional structures. This approach stands at the center of Thomas More’s ‘Utopia’ (1516): imagining a perfect world in comparison to the world we are living in. Regardless of its attainability, we focus on the thinking method itself.


Utopian Thinking has been at the foundation of many great technological innovations, for example the World Wide Web and smartphones, not to mention groundbreaking ideas such as the theory of relativity and the abolition of apartheid (Hök, 2019). According to Brown (2015), it also facilitates collective thinking, which is essential for tackling complex problems “in these times of transformational change” (p. 1). Bell and Pahl (2018) add that co-production – like using a think tank, for example – is a Utopian Thinking method. According to them (Bell & Pahl, 2018), Utopian Thinking methods are essential for reshaping the world as we know it for the better. In addition, it encourages the public to become involved in the process (Fernando et al., 2018), which is precisely the type of citizen involvement we deem important for the design, development, and implementation of Inverse Surveillance AI.

It is for these reasons that we approach our research from a utopian perspective, encouraging imaginative, original, out-of-the-box thinking, following the example of the great thinkers who stood at the basis of monumental innovations and ideas (Hök, 2019). We need to look past our current way of thinking within existing structures, and build a new vision of what is socially acceptable, in order to drive the growth and implementation of Surveillance AI (Harari, 2018). As Albert Einstein emphasized: “we cannot solve our problems with the same thinking we used when we created them” (Kataria, 2019).

Bibliography:

  1. Andersen, R. (2020). The Panopticon Is Already Here. The Atlantic. Retrieved from https://www.theatlantic.com/magazine/archive/2020/09/china-ai-surveillance/614197/
  2. BBC. (2021). Uighur-identifying patent is ‘deeply disturbing’. BBC News. Retrieved from https://www.bbc.com/news/av/technology-55651932
  3. Bell, D.M., & Pahl, K. (2018). Co-production: Towards a utopian approach. International Journal of Social Research Methodology, 21(1), 105-117.
  4. Brown, V.A. (2015). Utopian thinking and the collective mind: Beyond transdisciplinarity. Futures: The Journal of Policy, Planning and Futures Studies, 65, 209-216.
  5. Fernando, J. W., Burden, N., Ferguson, A., O’Brien, L. V., Judge, M., & Kashima, Y. (2018). Functions of Utopia: How Utopian Thinking Motivates Societal Engagement. Personality and social psychology bulletin, 44(5), 779-792. https://doi.org/10.1177/0146167217748604
  6. Harari, Y. N. (2018). 21 lessons for the 21st century (First ed.). Random House USA.
  7. Hök, B.W.  (2019). Are great innovations driven by utopian ideas? Journal of Innovation Management, 6(4), 98-116.
  8. Kataria, V. (2019). 3 Lessons from Albert Einstein on Problem Solving. Medium, The Startup. Retrieved from https://medium.com/swlh/3-lessons-from-albert-einstein-on-problem-solving-c5438b2ac2b9
  9. More, T. (1516). Utopia. Retrieved from Planet Ebook: https://www.planetebook.com/utopia/
  10. Scharmer, C., & Senge, P. (2016). Theory U: Leading from the future as it emerges: The social technology of presencing (Second ed.).

Conceptualizing Inverse Surveillance

Authors: J.P.R. van der Laarse & N.L. Neuman
Publication Date: April 23, 2021

In our new project, we focus on unwrapping the concept of Inverse Surveillance, and how it can be used to empower citizens with AI technology. Since we wanted to place surveillance in the hands of citizens, the first name that popped into mind to label this utopian vision of surveillance was ‘Inverse Surveillance’. After a quick Google search, we found that this term has actually been used before, so we did a deep dive into the literature. We soon learned that Inverse Surveillance is often used as a synonym (or translation) for sousveillance (Mann, 2004), and is also mentioned in relation to counter-surveillance. However, neither of these concepts fully captures what we were going for. We decided to flesh out this concept a bit more and write down what we think are the main distinctions between the different types of surveillance.


For those interested, we will publish in another post how we came to these distinctions and to our definition of inverse surveillance based on the literature. In this post, we will focus on the table below and our conclusions.

|  | Surveillance | Counter-surveillance | Sousveillance | Inverse Surveillance |
| --- | --- | --- | --- | --- |
| Agent | Top | Bottom | Bottom | Bottom |
| Subject | Bottom | Top | Top & Bottom | Top |
| Action | Surveillance | Evading & undermining | Surveillance & gaining more insight and involvement | Surveillance |
| Goal | Controlling and influencing subject | Counter-reaction against surveillance of citizens | Counter-reaction against surveillance of citizens | Controlling and influencing subject |
| Power Dynamic | Centralization of power | Challenging institutional power asymmetries | Reversing the balance of power (hierarchical sousveillance); levelling the balance of power (personal sousveillance) | Democratisation of power |

Surveillance

Although ‘surveillance’ is also an umbrella term for the other concepts, in its colloquial use surveillance refers to the systematic monitoring of citizens (bottom) by governments or bigger organizations (top), in order to influence and control them (goal) and thus exercise power (power dynamic) (Ball et al., 2012; Hier & Greenberg, 2014; Lyon, 2007).

Counter-surveillance

In the case of counter-surveillance, citizens (bottom) actively evade and undermine surveillance by governments and bigger organizations (top) as a counter-reaction to the surveillance of citizens (goal) and by doing so are challenging institutional power asymmetries (power dynamic) (Monahan, 2006).

Sousveillance

Sousveillance happens when citizens (bottom) are surveilling governments and bigger organizations (top) with the goal to gain more insight and involvement into surveillance, as a counter-reaction against the surveillance of citizens (goal) and by doing so reversing or leveling the power balance (power dynamic) (Mann, 2004; Mann et al., 2002). 

Conceptualizing a fourth surveillance type

The exact definition of sousveillance is quite broad. Some articles focus on sousveillance as a means of gaining insight into surveillance done by governments and bigger organizations, by surveilling the agent itself; in most of these articles, sousveillance takes a ‘stance against’ surveillance. In other articles, all surveillance activities in which citizens partake are included in the sousveillance concept.

The latter is closer to what we aim to focus on. Thus, according to existing terminology, our project would fall under sousveillance. We, however, wanted to draw one clear distinction from the ‘anti’ movement also present within sousveillance, and thus decided to separate the term inverse surveillance from sousveillance and give it a bit more depth. Whether our definition of inverse surveillance can be viewed as part of the umbrella term sousveillance is up for debate, but that is not what we are focusing on.

Our definition
Inverse Surveillance

In the case of inverse surveillance, citizens (bottom) surveil governments and bigger organizations (top) in order to control and influence (goal) and thus promote transparency and equality, and by doing so democratizing power (power dynamic).

This definition is not final yet, and it might change during the research. But we wanted to offer a clear starting point for fleshing out a new surveillance concept.

What we want to emphasize with this distinction is that our perspective on surveillance as a method is closer to surveillance than to sousveillance. In our case, the focus is not on surveillance itself; surveillance is merely a tool that we deem helpful in exercising power over, controlling, and influencing the subject. What differs from surveillance, however, and what puts us in line with sousveillance, is that in our case surveillance is done from the bottom to the top.

Facilitating Inverse Surveillance through Artificial Intelligence

Our suggestion to deepen the definition of Inverse Surveillance is the product of technological advancements through which ideas like these are becoming more realistic for the first time in history. In Foucault’s (1977) book, surveillance can only be used by those in power, due to the extensive resources needed to conduct large-scale surveillance (for example, by having a police force that can patrol). With the rise of AI, we no longer need hundreds of eyes to watch data, videos, or images. This makes AI a realistic tool not only for organizations to monitor individuals but also for individuals monitoring organizations, without needing the extensive resources organizations have. For this reason, our project focuses on employing AI to facilitate Inverse Surveillance.

Utopian Vision on Inverse Surveillance AI 

In this project, we focus on a utopian way of thinking. We realize that there are also many side effects to AI such as ethical complications, and these studies from a dystopian perspective are therefore also much needed. However, within this project, we are mainly looking for solutions, and innovative ideas to get this concept off the ground. Thus, from a utopian perspective, we focus not only on the possibilities of inverse surveillance but also on the broader role AI can play in society in this regard.

Throughout this project, our definition of inverse surveillance as elaborated upon here will serve as a starting point for our research. Building on this, we will focus on the utopian vision and the practical application of AI in the context of Inverse Surveillance. 

Bibliography:

  1. Ball, K., Haggerty, K., & Lyon, D. (2012). Routledge Handbook of Surveillance Studies (Routledge International Handbooks). Abingdon, Oxon; New York: Routledge.
  2. Foucault, M. (1977). Discipline and Punish: The Birth of the Prison. New York: Pantheon Books.
  3. Hier, S., & Greenberg, J. (2014). Surveillance: Power, Problems, and Politics. Vancouver: UBC Press.
  4. Lyon, D. (2007). Surveillance Studies: An Overview. Cambridge, UK; Malden, MA: Polity.
  5. Mann, S. (2004). Sousveillance: Inverse surveillance in multimedia imaging. Proceedings of the 12th ACM International Conference on Multimedia, New York, NY, USA, October 10-16, 2004, 620-627. DOI: 10.1145/1027527.1027673.
  6. Mann, S., Nolan, J., & Wellman, B. (2002). Sousveillance: Inventing and Using Wearable Computing Devices for Data Collection in Surveillance Environments. Surveillance & Society, 1(3), 331-355.
  7. Monahan, T. (2006). Counter-surveillance as Political Intervention? Social Semiotics, 16(4), 515-534.

Podcast: Creativity and Constraint in Artificial and Biological Intelligence

The Brain Inspired podcast approached us for a conversation about Creativity and Constraint in Biological and Artificial Intelligence. We cover generating art with neural networks, AI’s challenges for neuroscience, and how the infamous frame problem in AI traces all the way back to Plato.

Listen to it on iTunes, Spotify, or below:

Brain Inspired podcast 062 Stefan Leijnen: Creativity and Constraint

Neural Network Zoo Prequel: Cells and Layers

Cells

The Neural Network Zoo shows different types of cells and various layer connectivity styles, but it doesn’t really go into how each cell type works. I originally gave a number of cell types different colours to differentiate the networks more clearly, but I have since found out that these cells work in more or less the same way, so you’ll find the descriptions under the basic cell images.

A basic neural network cell, the type one would find in a regular feed forward architecture, is quite simple. The cell is connected to other neurons via weights, i.e. it can be connected to all the neurons in the previous layer. Each connection has its own weight, which is often just a random number at first. A weight can be negative, positive, very small, very big, or zero. The value of each of the cells it is connected to is multiplied by the respective connection weight, and the resulting values are all added together. On top of this, a bias is added. A bias can prevent a cell from getting stuck outputting zero, and it can speed up some operations, reducing the number of neurons required to solve a problem. The bias is also a number, sometimes constant (often -1 or 1) and sometimes variable. This total sum is then passed through an activation function, and the resulting value becomes the value of the cell.
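
As a minimal sketch of the computation just described (the tanh activation is an arbitrary choice):

```python
import numpy as np

def basic_cell(inputs, weights, bias):
    """Weighted sum of connected cells, plus a bias, through an activation."""
    return np.tanh(np.dot(inputs, weights) + bias)

inputs = np.array([0.5, -1.2, 0.8])    # values of the connected cells
weights = np.array([0.4, 0.1, -0.6])   # one (initially random) weight per connection
print(basic_cell(inputs, weights, bias=1.0))
```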

Convolutional cells are much like feed forward cells, except they’re typically connected to only a few neurons from the previous layer. They are often used to preserve spatial information, because they are connected not to a few random cells but to all cells in a certain proximity. This makes them practical for data with lots of localised information, such as images and sound waves (but mostly images). Deconvolutional cells are just the opposite: these tend to decode spatial information by being locally connected to the next layer. Both cell types often have a lot of clones which are trained independently, each clone having its own weights but connected in exactly the same way. These clones can be thought of as being located in separate networks which all have the same structure. Both are essentially the same as regular cells, but they are used differently.
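
A sketch of the local connectivity in one dimension; for simplicity, this version shares a single kernel across positions, the way standard convolutional layers do:

```python
import numpy as np

def conv1d(signal, kernel):
    """Each output cell is connected only to a small window of the input."""
    k = len(kernel)
    return np.array([np.dot(signal[i:i + k], kernel)
                     for i in range(len(signal) - k + 1)])

signal = np.array([0.0, 1.0, 2.0, 1.0, 0.0, -1.0])
kernel = np.array([0.25, 0.5, 0.25])   # three local connections per cell
print(conv1d(signal, kernel))
```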

Pooling and interpolating cells are frequently combined with convolutional cells. These cells are not really cells, but rather raw operations. Pooling cells take in the incoming connections and decide which connection gets passed through. In images, this can be thought of as zooming out on a picture: you can no longer see all the pixels, and the network has to decide which pixels to keep and which to discard. Interpolating cells perform the opposite operation: they take in some information and map it to more information. The extra information is made up, as if one were to zoom in on a low-resolution picture. Interpolating cells are not the only reverse operation of pooling cells, but they are relatively common, as they are fast and simple to implement. They are connected much like convolutional and deconvolutional cells, respectively.
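
A small sketch of both raw operations:

```python
import numpy as np

def max_pool(signal, size=2):
    """Pass through only the strongest value in each window ('zooming out')."""
    return signal.reshape(-1, size).max(axis=1)

def interpolate(signal, factor=2):
    """Map fewer values to more values by repetition ('zooming in')."""
    return np.repeat(signal, factor)

x = np.array([0.1, 0.9, 0.3, 0.4])
print(max_pool(x))                 # [0.9 0.4]
print(interpolate(max_pool(x)))    # [0.9 0.9 0.4 0.4]
```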

Mean and standard deviation cells (almost exclusively found in pairs, as probabilistic cells) are used to represent probability distributions. The mean is the average value, and the standard deviation represents how far to deviate from this average (in both directions). For example, a probabilistic cell used for images could contain the information on how much red there is in a particular pixel. The mean might say 0.5 and the standard deviation 0.2. When sampling from these probabilistic cells, one would enter these values into a Gaussian random number generator; anything between 0.4 and 0.6 would then be a quite likely result, with values further away from 0.5 being less and less likely (but still possible). They are often fully connected to either the previous or the next layer, and they do not have biases.
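
Sampling from the pixel-redness example in code:

```python
import numpy as np

rng = np.random.default_rng(seed=0)
mean, std = 0.5, 0.2                      # the probabilistic cell's two values
samples = rng.normal(mean, std, size=5)   # the Gaussian random number generator
print(samples)   # values near 0.5 are likely; far-off values are rare but possible
```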

Recurrent cells have connections not just in the realm of layers, but also over time. Each cell internally stores its previous value. They are updated just like basic cells, but with extra weights: connected to the previous values of the cells, and most of the time also to all the cells in the same layer. These weights between the current value and the stored previous value work much like a volatile memory (like RAM), inheriting both properties of having a certain “state” and vanishing if not fed. Because the previous value is a value passed through an activation function, and each update passes this activated value along with the other weights through the activation function, information is continually lost. In fact, the retention rate is so low that only four or five iterations later, almost all of the information is lost.
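
A minimal recurrent update, written out (the sizes are arbitrary):

```python
import numpy as np

def rnn_step(x, h_prev, W_in, W_rec, bias):
    """Basic-cell update plus extra weights to the cell's stored previous value."""
    return np.tanh(W_in @ x + W_rec @ h_prev + bias)

rng = np.random.default_rng(0)
W_in, W_rec, bias = rng.normal(size=(4, 3)), rng.normal(size=(4, 4)), np.zeros(4)
h = np.zeros(4)
for x in rng.normal(size=(6, 3)):          # six time steps of 3-dimensional input
    h = rnn_step(x, h, W_in, W_rec, bias)  # older information fades each step
print(h)
```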

Long short term memory cells are used to combat the problem of the rapid information loss occurring in recurrent cells. LSTM cells are logic circuits, copied from how memory cells were designed for computers. Compared to RNN cells, which store two states, LSTM cells store four: the current and last value of the output, and the current and last values of the state of the “memory cell”. They have three “gates” – input, output, and forget – and they also have the regular input. Each of these gates has its own weight, meaning that connecting to this type of cell entails setting up four weights (instead of just one). The gates function much like flow gates, not fence gates: they can let everything through, just a little bit, nothing, or anything in between. This works by multiplying incoming information by a value ranging from 0 to 1, which is stored in the gate value. The input gate, then, determines how much of the input is allowed to be added to the cell value. The output gate determines how much of the output value can be seen by the rest of the network. The forget gate is not connected to the previous value of the output cell, but rather to the previous memory cell value; it determines how much of the last memory cell state to retain. Because the forget gate is not connected to the output, much less information loss occurs, as no activation function is placed in the loop.
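
A sketch of one LSTM update in the standard formulation (variants differ in exactly what each gate is connected to):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h_prev, c_prev, W, U, b):
    """The four weight sets: input gate (i), forget gate (f), output gate (o),
    and the regular input (g)."""
    i = sigmoid(W["i"] @ x + U["i"] @ h_prev + b["i"])   # how much input to add
    f = sigmoid(W["f"] @ x + U["f"] @ h_prev + b["f"])   # how much memory to keep
    o = sigmoid(W["o"] @ x + U["o"] @ h_prev + b["o"])   # how much output to expose
    g = np.tanh(W["g"] @ x + U["g"] @ h_prev + b["g"])   # candidate input
    c = f * c_prev + i * g    # memory cell: no activation function in this loop,
    h = o * np.tanh(c)        # which is why information survives far longer
    return h, c

rng = np.random.default_rng(0)
n, m = 4, 3
W = {k: rng.normal(size=(n, m)) for k in "ifog"}
U = {k: rng.normal(size=(n, n)) for k in "ifog"}
b = {k: np.zeros(n) for k in "ifog"}
h, c = lstm_step(rng.normal(size=m), np.zeros(n), np.zeros(n), W, U, b)
print(h, c)
```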

Gated recurrent units (cells) are a variation of LSTM cells. They too use gates to combat information loss, but do so with just two gates: update and reset. This makes them slightly less expressive but also slightly faster, as they use fewer connections everywhere. In essence, there are two differences between LSTM cells and GRU cells: GRU cells do not have a hidden cell state protected by an output gate, and they combine the input and forget gates into a single update gate. The idea is that if you want to let in a lot of new information, you can probably forget some old information (and the other way around).
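
The corresponding GRU update, for comparison (one common formulation):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gru_step(x, h_prev, W, U, b):
    """Two gates: update (z) plays the combined role of the LSTM input and
    forget gates; reset (r) controls how much old state feeds the candidate."""
    z = sigmoid(W["z"] @ x + U["z"] @ h_prev + b["z"])
    r = sigmoid(W["r"] @ x + U["r"] @ h_prev + b["r"])
    h_new = np.tanh(W["h"] @ x + U["h"] @ (r * h_prev) + b["h"])
    return (1 - z) * h_prev + z * h_new   # letting in new info means forgetting old

rng = np.random.default_rng(1)
n, m = 4, 3
W = {k: rng.normal(size=(n, m)) for k in "zrh"}
U = {k: rng.normal(size=(n, n)) for k in "zrh"}
b = {k: np.zeros(n) for k in "zrh"}
print(gru_step(rng.normal(size=m), np.zeros(n), W, U, b))
```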

Layers

The most basic way of connecting neurons to form graphs is by connecting everything to absolutely everything. This is seen in Hopfield networks and Boltzmann machines. Of course, this means the number of connections grows quadratically with the number of neurons, but the expressiveness is uncompromised. This is referred to as completely (or fully) connected.

After a while it was discovered that breaking the network up into distinct layers is a useful feature, where the definition of a layer is a set or group of neurons which are not connected to each other, but only to neurons from other group(s). This concept is for instance used in Restricted Boltzmann Machines. The idea of using layers is nowadays generalised for any number of layers and it can be found in almost all current architectures. This is (perhaps confusingly) also called fully connected or completely connected, because actually completely connected networks are quite uncommon.

Convolutionally connected layers are even more constrained than fully connected layers: we connect every neuron only to neurons in other groups that are close by. Images and sound waves contain a very high amount of information if fed directly one-to-one into a network (e.g. using one neuron per pixel). The idea of convolutional connections comes from the observation that spatial information is probably important to retain. This turned out to be a good guess, as convolutional connections are used in many image and sound wave based neural network applications. This setup is, however, less expressive than fully connected layers. In essence, it is a way of “importance” filtering, deciding which of the tightly grouped information packets are important; convolutional connections are great for dimensionality reduction. At what spatial distance neurons can still be connected depends on the implementation, but ranges higher than 4 or 5 neurons are rarely used. Note that “spatial” often refers to two-dimensional space, which is why most representations show three-dimensional sheets of neurons being connected; the connection range is applied in all dimensions.

Another option is of course to randomly connect neurons. This comes in two main variations as well: allowing some percentage of all possible connections, or connecting some percentage of neurons between layers. Random connections help to reduce the performance of the network only linearly, and can be useful in large networks where fully connected layers run into performance problems. A slightly more sparsely connected layer with slightly more neurons can perform better in some cases, especially where a lot of information needs to be stored but not as much information needs to be exchanged (a bit similar to the effectiveness of convolutionally connected layers, but then randomised). Very sparsely connected systems (1 or 2%) are also used, as seen in ELMs, ESNs, and LSMs. Especially in the case of spiking networks this makes a lot of sense, because the more connections a neuron has, the less energy each weight will carry over, meaning less propagating and repeating patterns.

Time delayed connections are connections between neurons (often from the same layer, and even connected with themselves) that don’t get information from the previous layer, but from a layer from the past (previous iteration, mostly). This allows temporal (time, sequence or order) related information to be stored. These types of connections are often manually reset from time to time, to clear the “state” of the network. The key difference with regular connections is that these connections are continuously changing, even when the network isn’t being trained.

The following image shows some small sample networks of the types described above, and their connections. I use it when I get stuck on just exactly what is connected to what (which is particularly likely when working with LSTM or GRU cells):

Analyzing Six Deep Learning Tools for Music Generation

As deep learning is gaining in popularity, creative applications are gaining traction as well. Looking at music generation through deep learning, new algorithms and songs are popping up on a weekly basis. In this post we will go over six major players in the field, and point out some difficult challenges these systems still face. GitHub links are provided for those who are interested in the technical details (or if you’re looking to generate some music of your own).

Magenta

Magenta is Google’s open source deep learning music project. They aim to use machine learning to generate compelling music. The project went open source in June 2016 and currently implements a regular RNN and two LSTMs.
GitHub: https://github.com/tensorflow/magenta
Great, because: It can handle any monophonic MIDI file. The documentation is good, so it’s relatively easy to set up. The team is actively improving the models and adding functionality. For every model, Magenta has provided a training bundle that is trained on thousands of MIDI files. You can start generating new MIDI files right away using these pre-trained models.
Challenges: At this point, Magenta can only generate a single stream of notes. Efforts have been made to combine the generated melodies with drums and guitars – but based on human input, as of yet. Once a model that can process polyphonic music has been trained, it could start to create harmonies (or at least multiple streams of notes). This would indeed be a mighty step in their quest for the generation of compelling music.
Sounds like: The piece below is generated by Magenta from the 8th note onward. Here they use their attention model with the provided pre-trained bundle.
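
For a sense of what "a monophonic MIDI file" boils down to as model input, here is a hedged sketch using the pretty_midi library (not Magenta's own pipeline; the file path is a placeholder):

```python
import pretty_midi

# Flatten the first instrument of a MIDI file into a (pitch, start, end) sequence,
# roughly the kind of single note stream a melody model consumes.
pm = pretty_midi.PrettyMIDI("melody.mid")   # placeholder path
notes = sorted(pm.instruments[0].notes, key=lambda note: note.start)
sequence = [(note.pitch, note.start, note.end) for note in notes]
print(sequence[:8])
```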

DeepJazz

The result of a thirty-six-hour hackathon by Ji-Sung Kim, DeepJazz uses a two-layer LSTM that learns from a MIDI file as its input source. It received quite some news coverage in the first six months of its existence.
GitHub: https://github.com/jisungk/deepjazz
Great, because: It can create some jazz after being trained on a single MIDI file. The project itself is also compelling proof that creating a working computational music prototype using deep learning techniques can be a matter of hours, thanks to libraries like Keras, Theano, and TensorFlow.
Challenges: While it can handle chords, it converts the jazz MIDI to a single pitch and a single instrument. It would take a few more post-processing steps for the melodies created by deep learning to sound more like human-created jazz music.
Sounds like: The following piece was generated after 128 epochs (i.e. the training set, consisting of a single MIDI file, cycled through the model that many times).
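
For illustration, a two-layer LSTM for next-note prediction might be wired up as follows. This is a generic sketch of the architecture, not DeepJazz's actual code; the window length and pitch vocabulary are assumptions:

```python
import numpy as np
from tensorflow.keras.layers import LSTM, Dense
from tensorflow.keras.models import Sequential

SEQ_LEN, N_PITCHES = 32, 128   # assumed: 32-step windows over one-hot pitches

model = Sequential([
    LSTM(128, return_sequences=True, input_shape=(SEQ_LEN, N_PITCHES)),
    LSTM(128),
    Dense(N_PITCHES, activation="softmax"),   # distribution over the next note
])
model.compile(loss="categorical_crossentropy", optimizer="adam")

# Random stand-in batch for sequences extracted from a MIDI file.
X = np.random.rand(64, SEQ_LEN, N_PITCHES)
y = np.eye(N_PITCHES)[np.random.randint(0, N_PITCHES, 64)]
model.fit(X, y, epochs=1, verbose=0)
```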

BachBot

A research project by Feynman Liang at Cambridge University, also using an LSTM, this time trained on Bach chorales. Its goal is to generate and harmonize chorales indistinguishable from Bach’s own work. The website offers a test where one can listen to two streams and guess which one is an actual composition by Bach.
GitHub: https://github.com/feynmanliang/bachbot/
Great, because: Research found that people have a hard time distinguishing generated Bach from the real stuff. Also, this is one of the best efforts in handling polyphonic music as the algorithm can handle up to four voices.
Challenges: BachBot works best if one or more of the voices are fixed; otherwise the algorithm just generates wandering chorales. The algorithm could be used to add chorales to a generated melody.
Sounds like: In the below example the notes for “Twinkle Twinkle Little Star” were fixed, with the chorales generated.

FlowMachines

In the picturesque city of Paris, a research team is working on a system that can help keep an artist in a creative flow. Their system can generate lead sheets based on the style of a composer, drawing on a database of about 13,000 sheets. Instead of neural networks, Markov constraints are used as the underlying technique here.
GitHub: not open source.
Great, because: The system has composed the first AI pop songs.
Challenges: Producing pop songs from a generated lead sheet is not simply done at the click of a button – it still requires a well-skilled musician to create a compelling song like the one in the example below. Reducing the difficulty of these steps with the help of deep learning is still an open challenge.
Sounds like: The song below was composed by the FlowMachines AI. To do so, the musician chose the “Beatles” style and generated the melody and harmony. Note that the rest of the score (production, mixing, and assigning audio pieces to the notes) was produced by a human composer.

WaveNet

Researchers at Google’s DeepMind have created WaveNet. WaveNet is based on convolutional neural networks, the deep learning technique that has worked very well for image classification and generation in the past few years. Its most promising purpose is to enhance text-to-speech applications by generating a more natural flow in vocal sound. However, the method can also be applied to music, as both the input and output consist of raw audio.
GitHub: WaveNet’s code is not open source, but others have implemented it based on DeepMind’s documentation. For example: https://github.com/ibab/tensorflow-wavenet
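
The core idea those implementations follow can be sketched in a few lines: a stack of causal convolutions with exponentially growing dilation, so each layer doubles how far back in the audio the network can see. This is only the skeleton; the real WaveNet adds gated activations, residual and skip connections, and a softmax over quantized sample values:

```python
from tensorflow.keras.layers import Conv1D
from tensorflow.keras.models import Sequential

model = Sequential()
# First layer declares the input: e.g. one second of 16 kHz mono audio.
model.add(Conv1D(32, kernel_size=2, dilation_rate=1, padding="causal",
                 activation="relu", input_shape=(16000, 1)))
for dilation in [2, 4, 8, 16, 32]:   # receptive field doubles at each layer
    model.add(Conv1D(32, kernel_size=2, dilation_rate=dilation,
                     padding="causal", activation="relu"))
model.add(Conv1D(1, kernel_size=1))  # predict the next sample value
model.summary()
```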
Great, because: It uses raw audio as input. Therefore it can generate any kind of instrument, and even any kind of sound. It will be interesting to see what this technique is capable of once trained on hours of music.
Challenges: The algorithm is computationally expensive. It takes minutes to train on a second of sound. Some have started to create a faster version. Another researcher working for Google, Sageev Oore from the Magenta project, has written a blog post where he describes what can be learned from the musical output of Wavenet. One of his conclusions is that the algorithm can produce piano notes without a beginning, making them unplayable on a real piano. Interestingly, Wavenet can extend the current library of sounds that a piano can create and produce a new form of piano music – perhaps the next step in (generated) music.
Sounds like: Training on a dataset of piano music resulted in the following ten seconds of sound:

GRUV

A Stanford research project that, similar to WaveNet, uses audio waveforms as input, but with LSTMs and GRUs rather than CNNs. They showed their proof of concept to the world in June 2015.
GitHub: https://github.com/MattVitelli/GRUV
Great, because: The Stanford researchers were among the first to show how to generate sounds with an LSTM using raw waveforms as input.
Challenges: The demonstration they provide seems overfitted on a particular song, due to the small training corpus and the sheer number of layers in the network. The researchers themselves had neither the time nor the computational power to experiment further with this. Fortunately, this void is starting to be filled by the WaveNet researchers and other enthusiasts. Jakub Fiala has used this code to generate an interesting amen drum break; see this blog post.
Sounds like: The tool, trained on a variety of Madeon songs, resulted in the sample below. Until 1:10 you hear excerpts of the output after 100 up to 1,000 iterations; after that, a mash-up of their best generated pieces. This excerpt is a recording of this video.

Notes VS Waves

The described deep learning music applications can be divided into two categories based on the input method. Magenta, DeepJazz, BachBot, and FlowMachines all use input in the form of note sequences, while GRUV and Wavenet use raw audio.

| Input type | Note sequences | Raw audio |
| --- | --- | --- |
| Computational complexity | Low (minutes – few hours) | High (few hours – days) |
| Editable result | Yes, can be imported in music production software | No, the waveform itself has to be edited |
| Musical complexity | As complex as a single song from the corpus | As complex as the combination of the entire corpus |

Can we call out a clear winner? In my opinion: no. Each has different applications and these methods can coexist until generating compelling music with raw audio becomes so fast that there is simply no point in doing it yourself.

Music will be easier to create by people who are assisted by an AI that can suggest a melody or harmony. However, these people still need to be musicians (for now). The moment it is possible to train a deep learning algorithm on your entire Spotify history in raw audio form, and generate new songs, everyone can be a musician.

Image classification and generation have been improved with neural network techniques, reaching higher benchmark scores than ever before, mostly thanks to the speed at which huge sets of pixels can be trained on. For audio, the overarching question is: when will raw audio overtake notes as the pixel of music?


Did you miss anything, or do you have any other feedback? Comments are greatly appreciated. At the Asimov Institute we do deep learning research and development, so be sure to follow us on Twitter for future updates and posts! In this post we did not go into the technical details, but if you’re new to deep learning or unfamiliar with a method, I refer you to one of our previous posts on neural networks.

We are currently working on generating electronic dance music using deep learning. If you want to share your ideas on this, or have some interesting data to show, please send a message to frankbrinkkemper@gmail.com. Thank you for reading!