Posted in: Schools, Uncategorized

# G. Markov Chain Monte Carlo

Markov chain Monte Carlo. This is an extremely important technique for sampling from a complicated distribution by running a Markov chain. The main algorithm is the Metropolis or Metropolis-Hasting algorithm, but there are many others.

• Introduction: describe the problem you are solving.
• Background you are assuming. Assumptions for distributions – populations, etc.
• Results, proof, examples

Once you understand all the involved key concepts; you can delve in their applications: deep learning or machine learning.

Markov chain: – statistic theories.

• Harvard 110 Lecture 31-33
• Simon’s notes
• Examples

Markov Chain: network is key. See if you can model this example. Pay attention to the properties of this MC. The properties might be hypothesis and/or conditions for we to apply this. It is also the points which we can negate this technique when we prove that the hypothesis and or conditions are not true.

This video is for fun. But it relates philosophy to computer science which is interesting.

Monte Carlo – simulation method (algorithm) – steps.

• simulation MIT
• Simon’s notes

(MIT Introduction to Computational Thinking and Data Science Series)

etc.

## https://youtu.be/h1NOS_wxgGg

Introduction: This topic combines knowledge from three disciplines: maths, statistics, and data science (computational thinking). Statistics tries to study properties of population by using inferences from sampling. Thus not only you need to learn theories of probability and statistics but also you need to make sure sampling (data you pick up) can reflect the true properties of the population. The former (theoretic part) is maths-probabilities and the latter (data mining) is more and more computational.

So pay attention to

• how to use math to reflect the problems: example 1. expected value = average taking in account of probability of each value. example 2. transition matrix: present the state to state in matrix.
• how theories are related to computation: from statistics viewpoint, computer is only working on sampling – data science. That is, we understand Markov Chain first, then we use Monte Carlo simulation to generate data. But with AI, this correlation between computation and theories is becoming more integrated. Computation (data analysis) can give us new ideas. Theories can be proved by data, generate data based on certain distribution, and applied to real world.
• In generating data, we also related it to the real world. This is distributions come into the picture. Again, how to use math equations to represent the populations’ distributions. modeling; here math is very important again. Pay attention to hypothesis and/conditions here. These are theories from various disciplines: economics, medical, etc. …

Statistics: Data, Sample vs population; mean, median, mode, range & deviance/variance, Distributions

Types of Distributions: Gaussian (normal), Beyesian probability, exponential decay function, uniform distribution, proposal distribution.

This video list tells you what prerequisite knowledge you need for understanding MCMC. You need to understand probability and distribution – the basics of statistics or data science.

## Mathieu Groups

• Abstract
• Define a prerequisite: Steiner System, proof, and problems with this system.
• Mathieu Groups: Definitions; theorems and their proof
• Codes: Binary Golay Codes.
• description
• Significance in theory and in practice.
• result and proof
• example of the game of nim. related theorems and definitions, proof. etc.
• References

## Explaining the Shape of RSK

• Introduction: describe the algorithm RSK.
• “Let us illustrate how the RSK algorithm is performed.”
• There are many possible questions that one can ask about the RSK algorithm.
• Monotone Subsequences: definition, theorem, proof, and examples.
• Knuth’s equivalence relation
• The shape of RSK
• Note: problems in the above work and suggestions for future research or something that is important but not discussed in this paper.
• References