
Session 3: Basics of AI

We continue with the history of AI and move on to basic concepts, terminology, and the main learning paradigms.

Plan

Homework

  • We compare the generated history overviews.

Results

  • Typical result:
  • Alan Turing
  • Dartmouth summer school (1956) as the first appearance of the term
  • AI winter lasting into the 80s/90s
  • 1990s: successes with machine learning (Deep Blue)
  • From 2012: deep learning with AlexNet
  • Since the 2020s: GenAI and the goal of AGI

Quiz: Guess the decade of this audio sample

  • Play the audio sample and let students guess the decade it is from.

History of AI - Milestones (continued)

  • General idea of deep learning:
    • We listen to Turing Award winners Geoffrey Hinton and Yann LeCun and their 2019 lectures on "The Deep Learning Revolution".
  • We differentiate between reinforcement, supervised and self-supervised learning.
  • What happened in the last few years after GPT-3?

Learning Paradigms: Supervised, Reinforcement, and Self-Supervised Learning

  • We discuss the differences between supervised, reinforcement, and self-supervised learning.
  • We discuss a classroom exercise (or short project) that helps students build an intuitive, hands-on understanding of the differences between these learning paradigms.
  • We first find out how a human would learn the same skill with an equivalent approach.
  • Then we run and study small implementations / simulations for each learning paradigm (a minimal sketch is given below).
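
The course's own implementations are not part of these notes. As a placeholder, here is a minimal sketch (plain Python, standard library only) of what such toy simulations could look like; the tasks, function names, and hyperparameters are illustrative assumptions, not the course material.

```python
import random

# Supervised learning: fit y = 2x from labeled (input, target) pairs
# via stochastic gradient descent on the squared error.
def supervised_demo(steps=200, lr=0.05):
    data = [(x, 2.0 * x) for x in (0.5, 1.0, 1.5, 2.0)]
    w = 0.0
    for _ in range(steps):
        x, y = random.choice(data)
        w -= lr * (w * x - y) * x
    return w  # approaches 2.0

# Reinforcement learning: a 2-armed bandit agent that only receives rewards
# and learns value estimates with an epsilon-greedy strategy.
def reinforcement_demo(steps=500, epsilon=0.1):
    true_reward = [0.2, 0.8]          # hidden from the agent
    estimates, counts = [0.0, 0.0], [0, 0]
    for _ in range(steps):
        if random.random() < epsilon:
            a = random.randrange(2)                 # explore
        else:
            a = estimates.index(max(estimates))     # exploit
        r = 1.0 if random.random() < true_reward[a] else 0.0
        counts[a] += 1
        estimates[a] += (r - estimates[a]) / counts[a]
    return estimates  # the agent should come to prefer the second arm

# Self-supervised learning: predict the next element of a sequence;
# the "labels" come from the data itself, not from human annotation.
def self_supervised_demo(steps=300, lr=0.01):
    seq = [i % 5 for i in range(100)]  # a simple repeating pattern
    w = 0.0
    for _ in range(steps):
        t = random.randrange(len(seq) - 1)
        x, target = seq[t], seq[t + 1]
        w -= lr * (w * x - target) * x
    return w

if __name__ == "__main__":
    print("supervised weight:", round(supervised_demo(), 3))
    print("bandit value estimates:", [round(v, 2) for v in reinforcement_demo()])
    print("next-step predictor weight:", round(self_supervised_demo(), 3))
```

The common thread is the feedback signal: explicit labels (supervised), a scalar reward (reinforcement), and targets derived from the data itself (self-supervised).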

Materials

Code examples

Audio samples for the course

Another speech about AI whose year is kept secret; students have to guess the year.

(Author: Maximilian Schönherr; source to be added later)

Sources for the image and video samples shown

  • Geoffrey Hinton and Yann LeCun: Turing Lectures on "The Deep Learning Revolution" (2019).

Results

Remarks about the audio quiz

The recording is from 1986. The computer scientist Hans-Werner Hein talks about the "bad image of AI". It was recorded by Maximilian Schönherr at a public meeting in Munich on September 12, 1986, and can be found on Wikipedia. The students estimated the year to be between 1990 and 2005, as the terminology used is quite modern and the challenges mentioned are still relevant today. However, the audio quality and the style of speaking gave away the older date.

Notes from the lectures

  • Geoffrey Hinton starts with two paradigms in computer science: Logical and biological.
  • The logical approach is based on rules, symbols, and reasoning; the typical workflow is to design an intelligent system by hand and program it.
  • The biological approach does not operate on symbolic language but on vectors of data; in applications, a program is implemented that learns from data.
  • Hinton then raises the question of how far we can get with the biological approach.
  • Early research showed that learning is inefficient without a strategy. Only once backpropagation was used to train multi-layer networks did practical applications such as speech recognition and handwritten digit classification become learnable (a from-scratch sketch of backpropagation on a toy problem follows after these notes).
  • Besides reinforcement learning (which Hinton skips), there is supervised learning, where labeled data is used to train a model, and unsupervised learning, where no labels are used and the model has to find structure in the data. For a long time, neither approach worked well with neural networks in many practical applications.
  • The scientific community did not accept neural networks until their superior performance in practical applications had been demonstrated, first for speech, then for image recognition (ImageNet), and later for natural language processing (transformers).
  • The key factors for the success of deep learning were the availability of large datasets (e.g. ImageNet) and increased computational power (mainly due to affordable and programmable GPUs).
  • Hinton further explains that while CNNs work well, they do not classify images the way humans do: human vision relies on coordinate frames. He demonstrated the effect by asking the audience to imagine a cube and to rotate it mentally so that the rear upper left corner sits directly above the front lower right corner, and then to point out the positions of the remaining corners. Most people pointed out only four corners, although six remain (a cube has eight corners in total). This shows that humans think in coordinate frames and struggle with transformations that change the coordinate frame. He refers to the paper "Stacked Capsule Autoencoders", which demonstrates an unsupervised learning approach to image classification.
  • He concludes the talk with the vision that future neural nets should not just train weights and then fix them, as in long-term memory, but should also operate on other time scales. It might be possible to include a mechanism for short-term memory that can rapidly adapt to new situations; this would resemble the human brain more closely.
  • Yann LeCun continues with the sequel. He starts with a couple of examples where supervised learning works well, but the amount of available GPU memory limits the size of the models.
  • He then states that reinforcement learning is not efficient enough for real-world applications: the agent has to explore a large state space and try out many actions to find a good policy, which is slow in the real world and can be harmful.
  • He presents findings by Emmanuel Dupoux (see "How Do Infants Bootstrap into Spoken Language? Models and Challenges"), who studied how babies acquire various abilities, and concludes that self-supervised learning should solve the issue.
  • The idea is to let the machine predict many potential outcomes instead of just a single one. He shows a cake metaphor in which the main body of the cake (the génoise) is self-supervised learning, the icing is supervised learning, and the cherry on top is reinforcement learning.
  • He then presents some examples of self-supervised learning.
  • He argues that the latent space is crucial for self-supervised learning: the model has to learn a good representation of the data that captures the relevant features. GANs are one way to achieve this; another is contrastive learning, where the model learns to distinguish between similar and dissimilar data points. (Note that it is not entirely clear whether LeCun really refers to contrastive learning here, as he does not mention it explicitly; a toy contrastive-learning sketch follows after these notes.)
  • He notes that language is much simpler than vision because it is discrete and compositional, which makes it easier to learn representations.
  • He sketches some ideas about how to control the self-supervised learning process.
  • He concludes the talk with the observation that several times in history, the invention came before the theory. His hope is that the invention of deep learning leads to a better understanding of intelligence.
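
To make Hinton's point about backpropagation concrete, here is a minimal from-scratch sketch: a two-layer network trained with backpropagation on XOR, a problem a single layer cannot solve. The architecture, learning rate, and number of steps are arbitrary illustrative choices, not taken from the lecture.

```python
import numpy as np

# XOR inputs and targets: not linearly separable, so a single layer fails,
# but a two-layer network trained with backpropagation can learn it.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(2, 8)), np.zeros(8)   # hidden layer (8 units, arbitrary)
W2, b2 = rng.normal(size=(8, 1)), np.zeros(1)   # output layer

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

lr = 1.0
for step in range(10000):
    # forward pass
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)

    # backward pass: chain rule for a squared-error loss
    d_out = (out - y) * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)

    # gradient-descent updates, averaged over the four examples
    W2 -= lr * h.T @ d_out / len(X)
    b2 -= lr * d_out.mean(axis=0)
    W1 -= lr * X.T @ d_h / len(X)
    b1 -= lr * d_h.mean(axis=0)

# Should end up close to [0, 1, 1, 0]; convergence depends on the random init.
print(np.round(out.ravel(), 2))
```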
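
For the contrastive-learning remark above, a toy sketch of the idea: two "views" of the same sample share an informative feature and differ only in a nuisance feature, and a linear embedding is trained so that paired views end up close while unrelated samples end up at least a margin apart. The data, loss, and hyperparameters are illustrative assumptions, not from the lecture.

```python
import numpy as np

rng = np.random.default_rng(1)

def make_pair():
    """Two views of one sample: same informative feature, different nuisance noise."""
    s = rng.uniform(-1, 1)
    view = lambda: np.array([s, rng.normal()])
    return view(), view()

w = rng.normal(size=2)      # linear embedding into one dimension
lr, margin = 0.02, 1.0

for step in range(5000):
    a1, a2 = make_pair()    # positive pair (same underlying sample)
    b1, _ = make_pair()     # negative sample (different underlying sample)

    # pull the two views of the same sample together ...
    d_pos = w @ a1 - w @ a2
    grad = 2 * d_pos * (a1 - a2)

    # ... and push different samples at least `margin` apart
    d_neg = w @ a1 - w @ b1
    if abs(d_neg) < margin:
        grad += -2 * (margin - abs(d_neg)) * np.sign(d_neg) * (a1 - b1)

    w -= lr * grad

# The nuisance coordinate should get a weight near zero; the embedding relies
# mostly on the informative coordinate.
print("learned weights:", np.round(w, 2))
```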

Homework

The students will work on the tasks at home. For the human learning part, they have chosen the following topics:

  • Swimming: Leon+Flavio
  • Spanish: Athanasios, Isabell
  • Violin: Alyssa, Chantal
  • Horse riding: Ali

The results should be uploaded to the shared folder in FELIX by next week.

Side results

  • Bayesian inference is not covered in this course but may be important for understanding AI. A good resource is H. Pishro-Nik, "Introduction to Probability, Statistics, and Random Processes", Kappa Research LLC, 2014, in particular the chapter on Bayesian inference (a minimal numeric sketch follows below).
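
For reference, a minimal numeric sketch of Bayesian updating, using a beta-binomial coin-flip example; the counts are made up for illustration.

```python
# Bayesian inference in its simplest form: estimate a coin's probability of heads.
# Prior: Beta(1, 1), i.e. uniform. Data: 7 heads in 10 flips (made-up numbers).
# The Beta prior is conjugate to the binomial likelihood, so the posterior is
# again a Beta distribution and can be written down directly.
prior_a, prior_b = 1, 1
heads, flips = 7, 10

post_a = prior_a + heads
post_b = prior_b + (flips - heads)

posterior_mean = post_a / (post_a + post_b)
print(f"posterior: Beta({post_a}, {post_b}), mean = {posterior_mean:.3f}")
```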