Teaching

2021.2
Big Data and Astroinformatics:
Slides:
Exercises:
1. Create a new column from a FITS catalog and save the new catalog
2. Estimate the limiting magnitude of the provided catalog. Use the relation between mag_err and S/N
3. Create your catalog using SExtractor (astromatic.net/software/sextractor/). The command should be:
sextractor file_name.fits -c config.sex -PARAMETERS_NAME sex.param -CATALOG_NAME output.fits -CATALOG_TYPE FITS_1.0 -GAIN 8116 -PIXEL_SCALE 0.06 -SEEING_FWHM 0.1 -MAG_ZEROPOINT 25.6651 -PHOT_FLUXFRAC 0.682 -DEBLEND_MINCONT 0.0015 -DEBLEND_NTHRESH 32
The image file is here: FF - HST-Image in f435 band (use it in place of “file_name.fits”). Then compare with the catalogs from Molino et al. 2017; the catalog for the given image is presented here.
You have to create your own sex.param file. Please refer to the Source Extraction documentation.
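For exercise 2, the relation between mag_err and S/N follows from m = -2.5 log10(F): the magnitude error is approximately (2.5 / ln 10) / (S/N), or about 1.0857 / (S/N). The sketch below assumes one common convention, that the limiting magnitude is the typical magnitude of sources at S/N = 5; the threshold, the tolerance, and the function names are illustrative choices rather than part of the exercise, and the synthetic arrays stand in for the catalog's magnitude and error columns.

```python
import math
import random
import statistics

# mag_err = (2.5 / ln 10) / (S/N) ~= 1.0857 / (S/N)
POGSON = 2.5 / math.log(10.0)


def mag_err_from_snr(snr):
    """Magnitude error corresponding to a given signal-to-noise ratio."""
    return POGSON / snr


def snr_from_mag_err(mag_err):
    """Signal-to-noise ratio corresponding to a given magnitude error."""
    return POGSON / mag_err


def limiting_magnitude(mags, mag_errs, snr_limit=5.0, tol=0.1):
    """Median magnitude of the sources whose S/N lies within a fractional
    tolerance `tol` of `snr_limit` (assumed definition of the limit)."""
    selected = [
        m for m, e in zip(mags, mag_errs)
        if abs(snr_from_mag_err(e) / snr_limit - 1.0) < tol
    ]
    if not selected:
        raise ValueError("no sources near the requested S/N")
    return statistics.median(selected)


if __name__ == "__main__":
    # Synthetic stand-in for the catalog: S/N scales with flux and is
    # set to 5 at magnitude 26 by construction.
    rng = random.Random(0)
    mags = [rng.uniform(20.0, 28.0) for _ in range(5000)]
    errs = [mag_err_from_snr(5.0 * 10 ** (-0.4 * (m - 26.0))) for m in mags]
    print(round(limiting_magnitude(mags, errs), 2))
```

With a real catalog, the two lists would come from the magnitude and magnitude-error columns, and plotting mag_err against magnitude is a useful check that the selected sources trace a clean sequence.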
For the lecture on Tuesday, Sept-28, please install:
Arianna Cortesi: http://www.star.bris.ac.uk/~mbt/topcat/ https://aladin.u-strasbg.fr/
4. Derive your Hubble constant from several datasets: hubble_fit_data
5. Demonstrate that an Ordinary Least Squares fit is equivalent to a Maximum Likelihood of a Gaussian variable with a linear model
6. Demonstrate that the Maximum Likelihood Method for a Bernoulli variable with a linear model is equivalent to minimizing the cross-entropy between predictions and measured variables.
7. Demonstrate that the minimum of cross-entropy is obtained when the two probabilities are equal.
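Before attempting the demonstration in exercise 7, a quick numerical check can build intuition. The sketch below is only a sanity check for the Bernoulli case, not the demonstration the exercise asks for; the function names and the grid resolution are illustrative assumptions. It scans the parameter q and reports where the cross-entropy H(p, q) = -sum_i p_i log q_i is smallest.

```python
import math


def cross_entropy(p, q):
    """H(p, q) = -sum_i p_i * log(q_i) for two discrete distributions."""
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q) if pi > 0.0)


def best_bernoulli_q(p, steps=999):
    """Grid-search the Bernoulli parameter q minimizing H((p, 1-p), (q, 1-q))."""
    grid = [(k + 1) / (steps + 1) for k in range(steps)]  # 0.001 ... 0.999
    return min(grid, key=lambda q: cross_entropy((p, 1.0 - p), (q, 1.0 - q)))


if __name__ == "__main__":
    for p in (0.1, 0.3, 0.5, 0.8):
        q = best_bernoulli_q(p)
        entropy = cross_entropy((p, 1.0 - p), (p, 1.0 - p))
        print(f"p = {p:.1f}  argmin_q = {q:.3f}  H(p, p) = {entropy:.4f}")
```

The grid minimum lands on q = p for each value tried, which is the result the analytical argument should reproduce for general distributions.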
2020.2
Deep Learning
It is recommended to create a repository on a service such as GitHub or GitLab to publish your solutions. Try building your solutions using Python notebooks. Task solutions can be posted at: https://github.com/CBomDeepLearningClass2020b
The final course project will consist of a seminar of up to 20 minutes (followed by questions) about the work developed, accompanied by a notebook that must be posted in the CBomDeepLearningClass2020b organization.
The following will be considered as grading criteria:
-> Notebook clarity. Didactic code, well documented with rich explanations.
-> Mastery of the topic. What differentiates your model from other possible models? How and why was the architecture defined/considered adequate?
-> Metrics chosen to evaluate the problem. Did the training converge? Was there overfitting? What were the attempts to remove it? How do we know the model works and with what metric?
-> Data treatment.
Notebook submission deadline (hard limit) is 12/11/2020. By then, everyone must have presented.
Proposed projects for 2020.b
1 - Not so few, not so big. What is the data limit to train and converge gravitational arc identification models (images)?
2 - What is the detection limit for a gravitational lens (images)?
3 - Few-shot learning for classification of gravitational lenses (images).
4 - Model for star/galaxy separation (tabular/images).
5 - Uncertainty for regression in gravitational lenses. (inverse modeling in images).
6 - Exoplanet detection https://www.kaggle.com/keplersmachines/kepler-labelled-time-series-data (sequence)
7 - Galaxy morphological classification with Deep Learning https://www.kaggle.com/c/galaxy-zoo-the-galaxy-challenge/data (images)
8 - Transient classification using photometry https://www.kaggle.com/c/PLAsTiCC-2018 (sequence)
9 - Credit card fraud detection https://www.kaggle.com/mlg-ulb/creditcardfraud (tabular)
10 - Pneumonia detection https://www.kaggle.com/paultimothymooney/chest-xray-pneumonia (images)
11 - Heart failure mortality prediction https://www.kaggle.com/andrewmvd/heart-failure-clinical-data/ (tabular)
12 - Wine quality prediction https://www.kaggle.com/uciml/red-wine-quality-cortez-et-al-2009/ (tabular)
13 - Heartbeat classification https://www.kaggle.com/kinguistics/heartbeat-sounds (audio)
14 - Sentiment analysis from audio https://www.kaggle.com/uwrfkaggler/ravdess-emotional-speech-audio (audio)
15 - CT segmentation for COVID-19 detection https://www.kaggle.com/andrewmvd/covid19-ct-scans (Images / Semantic segmentation)
Tasks
Task 1 - Diabetes diagnosis: Diagnosis dataset
Task 2 - First dense network with MNIST
Task 3 - First CNN with MNIST
Task 4 - Classification in a mini-challenge:
https://bitbucket.org/kognitalab/images_mini_challange/src/master/
Part 2: Evaluate your results by generating a ROC curve and calculating its area.
Part 3: Test the robustness by shuffling training and test sets multiple times and computing average and standard deviation of the ROC.
Part 4: Evaluate the overfitting of your network.
Part 5: Add another Conv+ReLU+Maxpool block and repeat Part 3. Did it improve? Any insights?
Task 5 - First kaggle challenge: https://www.kaggle.com/c/titanic/overview/evaluation
Task 6 - Do it yourself:
Goal: Without Keras or Tensorflow, implement and train a network from scratch in Python.
Task 7 - Regression on kaggle: https://www.kaggle.com/c/house-prices-advanced-regression-techniques
1. Build a model with Random Forest
2. Build a model with LightGBM
3. Make an Ensemble
Task 8 - Apply your architecture to the Strong Lensing dataset and compare.
Task 9 - Time series with Deep Learning
1 - Follow the tutorial: aboveintelligent.com https://aboveintelligent.com/time-series-analysis-using-recurrent-neural-networks-lstm-33817fa4c47a
2 - Follow the tutorial: Kaggle https://www.kaggle.com/amirrezaeian/time-series-data-analysis-using-lstm-tutorial
3 - Implement an LSTM for the rainfall classification challenge https://www.kaggle.com/c/how-much-did-it-rain#description
Replicate the winning solution: https://github.com/danzelmo/how-much-did-it-rain
Compare your results using the official metric: https://www.kaggle.com/c/how-much-did-it-rain#evaluation
Task 10 - AE vs PCA
Show that PCA is equivalent to AE with linear functions under a specific loss function.
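A numerical illustration can accompany the demonstration in Task 10. The sketch below assumes the loss is the mean squared reconstruction error and fits a linear autoencoder (no biases, no activations) by alternating least squares instead of gradient descent, a choice made here only because each step is exact; the function names and the synthetic data are illustrative. It then compares the result with the rank-k PCA reconstruction obtained from the SVD.

```python
import numpy as np


def pca_reconstruction_mse(X, k):
    """MSE of the rank-k PCA reconstruction of centered data X (via SVD)."""
    _, _, vt = np.linalg.svd(X, full_matrices=False)
    components = vt[:k]                       # (k, d) principal directions
    X_hat = X @ components.T @ components
    return np.mean((X - X_hat) ** 2), components


def fit_linear_autoencoder(X, k, n_iter=100, seed=0):
    """Fit encoder E (d, k) and decoder D (k, d) minimizing ||X - X E D||^2.

    Alternating least squares: each half-step solves a linear least-squares
    problem exactly, so the MSE loss never increases.
    """
    rng = np.random.default_rng(seed)
    E = rng.normal(size=(X.shape[1], k))
    for _ in range(n_iter):
        Z = X @ E                                   # codes, shape (n, k)
        D = np.linalg.lstsq(Z, X, rcond=None)[0]    # best decoder for these codes
        E = np.linalg.pinv(D)                       # best codes for this decoder are X @ pinv(D)
    D = np.linalg.lstsq(X @ E, X, rcond=None)[0]    # final decoder for the final encoder
    return E, D


if __name__ == "__main__":
    rng = np.random.default_rng(42)
    scales = np.array([3.0, 2.0, 0.5, 0.3, 0.1])
    rotation, _ = np.linalg.qr(rng.normal(size=(5, 5)))
    X = (rng.normal(size=(500, 5)) * scales) @ rotation
    X -= X.mean(axis=0)                             # PCA assumes centered data

    pca_mse, components = pca_reconstruction_mse(X, k=2)
    E, D = fit_linear_autoencoder(X, k=2)
    ae_mse = np.mean((X - X @ E @ D) ** 2)
    print("PCA MSE:", pca_mse)
    print("Linear AE MSE:", ae_mse)
```

The encoder and decoder weights generally differ from the principal components by an invertible k x k mixing, so the quantities to compare are the reconstruction error and the subspace spanned by the decoder rows, not the weights themselves.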
2019.2
- Basic Electricity
- Introduction to Scientific Methodology
Attention: due to force majeure, Prof. Clécio's classes will resume starting on 10/04 (Friday).
Basic Electricity
Lists for P1
List 1
Moysés Nussenzveig, Basic Physics, vol III, 1st Ed.
Chap 2: 1, 5, 7, 8, 9
Chap 21: 1, 3, 20, 21, 23, 27, 34, 37, 42, 45, 72, 86, 91
List 2
Moysés Nussenzveig, Basic Physics, vol III, 1st Ed.
Chap 4: 1, 3, 4, 6, 10
Chap 23: 13, 14, 19, 41, 46, 50, 59, 63, 66, 67, 81, 82, 83
Introduction to Scientific Methodology
Introduction to Scientific Methodology from UFSM Methodology (Units 1 and 2)
Capes Methodology Guide (Units 1 and 2)
P2:
Seminar topics, groups of up to 4 people - 20 min
Antiscience movements: Flat Earth
Antiscience movements: Anti-Vaccine Movement
Theory of Evolution vs Intelligent Design
From the scientific method and reproducible observations, discuss Heliocentrism.
Genetically modified foods under the lens of science: are they harmful?
Practices that claim scientific support: Homeopathy
Practices that claim scientific support: Quantum Healing
Global Warming
Additional material:
Flat Earth - Documentary
Mindwalk - Film
The Name of the Rose - Film
Why is philosophy important in teaching science in universities? - Folha de SP column
There is no such thing as exact science (and let’s agree they’re all human)
Why I teach Plato to plumbers
Philosophy of science and science education: an analogy - Alberto Villani
2019.1
Deep Learning minicourse: Introduction to Deep Learning in Astronomy from February 11-14 at IAG-USP.
- Basic Electricity
- Introduction to Scientific Methodology
Attention: Due to force majeure, Basic Electricity classes will only resume on 03/29 (Friday).
Note: A teaching assistant, GABRIEL MEDEIROS DA CUNHA, is available from 1:10 PM to 2:50 PM (Tuesday) and from 3:50 PM to 4:30 PM (Thursday), starting 03/19. Look for him in the block C classrooms.
List 1
Moysés Nussenzveig, Basic Physics, vol III, 1st Ed.
Chap 2: 1, 5, 7, 8, 9
Chap 3: 2, 6, 9, 13, 16
Young and Freedman, Physics III, 14th Ed.
Chap 21: 1, 3, 20, 21, 23, 27, 34, 37, 42, 45, 72, 86, 91
Chap 22: 1, 7, 9, 12, 13, 33, 49, 61, 62
List 2
Moysés Nussenzveig, vol III, 1st Ed.
Chap 4: 1, 3, 4, 6, 10
Chap 5: 2, 4, 9, 10
Young and Freedman, Physics III, 14th Ed.
Chap 23: 13, 14, 19, 41, 46, 50, 59, 63, 66, 67, 81, 82, 83
Chap 24: 4, 12, 40, 45, 64, 66, 67
List 3 (for T1)
Moysés Nussenzveig, vol III, 1st Ed.
Chap 6: 2, 4, 5, 9
Chap 7: 1, 3, 4, 5
Young and Freedman, Physics III, 14th Ed.
Chap 25: 1, 2, 5, 7, 9, 19, 37
Chap 27: 27, 34, 35, 39, 42
Chap 28: 27, 40, 43
List 4 (for T2)
Moysés Nussenzveig, vol III, 1st Ed.
Chap 8: 1, 4, 6, 8, 9, 11
Chap 9: 4, 5, 7, 8, 10
Important dates (Updated on 06/20): 06/26 - T2 Mechanics class, 06/27 - T2 Production, 06/28 - P3 Production, 07/03 - P3 Mechanics, 07/04 - Final Exam Production, 07/04 - Final Exam Mechanics. P3 and Final Exam cover all content.
Attention: Final deadline for the Scientific Methodology final paper is 06/20. All papers must be presented by 07/04.
Supplementary material:
The Earth is Flat - Documentary
Mindwalk - Film
The Name of the Rose - Film
Why philosophy matters in science education at universities - Folha de SP
There are no exact sciences (and let's agree all are human sciences)
Why I teach Plato to plumbers
Philosophy of science and science teaching: an analogy - Alberto Villani
Seminar topics (groups up to 4 people - 20 min):
Anti-science movements: The Earth is flat
Anti-science movements: Anti-vaccine movement
Evolution Theory vs Intelligent Design
Based on the scientific method and accessible and reproducible observations, discuss Heliocentrism
GMOs under scientific scrutiny: Are they harmful?
Practices claiming scientific backing: Homeopathy
Practices claiming scientific backing: Quantum Healing
Global Warming
2018.2
- Deep Learning
- Basic Electricity
- Scientific Methodology
- Tim 4
- Tim 2
Deep Learning
Task 1 (classification):
https://bitbucket.org/kognitalab/images_mini_challange/src/master/
Part 2: Evaluate your results by generating a ROC curve and calculating its area.
Part 3: Assess the robustness of your results by randomly redefining the training and test samples at least 10-20 times and computing the curve's mean and standard deviation.
Part 4: Evaluate the overfitting of your network.
Part 5: Add another Conv+ReLU+Maxpool block and repeat Part 3.
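The evaluation in Parts 2 and 3 can be sketched in plain Python as follows. This is a minimal illustration, not the course's reference solution: `fit_and_score` is a hypothetical callable standing in for training your network and scoring the test images, and the 30% test fraction and number of repeats are arbitrary defaults. It assumes binary labels (0 or 1) and that every test split contains both classes.

```python
import random
import statistics


def roc_curve(labels, scores):
    """Return (fpr, tpr) lists obtained by sweeping the threshold over the scores."""
    pairs = sorted(zip(scores, labels), key=lambda t: -t[0])
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    fpr, tpr = [0.0], [0.0]
    tp = fp = 0
    for i, (score, label) in enumerate(pairs):
        if label == 1:
            tp += 1
        else:
            fp += 1
        # Emit a point only after the last item of a group of tied scores.
        if i + 1 == len(pairs) or pairs[i + 1][0] != score:
            fpr.append(fp / n_neg)
            tpr.append(tp / n_pos)
    return fpr, tpr


def auc(fpr, tpr):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum(
        (fpr[i] - fpr[i - 1]) * (tpr[i] + tpr[i - 1]) / 2.0
        for i in range(1, len(fpr))
    )


def repeated_auc(samples, labels, fit_and_score, n_repeats=10,
                 test_fraction=0.3, seed=0):
    """Mean and standard deviation of the test AUC over random train/test splits.

    `fit_and_score(train_x, train_y, test_x)` is a hypothetical callable that
    trains a model and returns one score per test sample.
    """
    rng = random.Random(seed)
    indices = list(range(len(samples)))
    n_test = int(len(indices) * test_fraction)
    aucs = []
    for _ in range(n_repeats):
        rng.shuffle(indices)
        test_idx, train_idx = indices[:n_test], indices[n_test:]
        scores = fit_and_score(
            [samples[i] for i in train_idx],
            [labels[i] for i in train_idx],
            [samples[i] for i in test_idx],
        )
        fpr, tpr = roc_curve([labels[i] for i in test_idx], scores)
        aucs.append(auc(fpr, tpr))
    return statistics.mean(aucs), statistics.stdev(aucs)
```

Reporting the mean AUC together with its standard deviation makes it easier to judge whether the change made in Part 5 is a real improvement or within the split-to-split scatter.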
Task 2 (regression):
Where is Waldo?
Goal: Given a set of images, build a network to identify the location of an object by defining 4 parameters: xmin, ymin, xmax, ymax.
The data is available at: /home/Dados/GeoSimula
Evaluate your results using a completeness vs centroid distance plot. Calculate the area of this figure and produce some example images with bounding boxes.
If you really want to find Waldo, go to: findingwally.pythonanywhere.com
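The completeness vs centroid distance plot requested in Task 2 can be computed along these lines. The definition used here is an assumption, since the task text does not spell it out: completeness at a distance d is taken as the fraction of objects whose predicted box centroid lies within d pixels of the true centroid, and the area is normalized by the range of thresholds. Function names are illustrative.

```python
import math


def centroid(box):
    """Center (x, y) of a box given as (xmin, ymin, xmax, ymax)."""
    xmin, ymin, xmax, ymax = box
    return (xmin + xmax) / 2.0, (ymin + ymax) / 2.0


def centroid_distances(true_boxes, pred_boxes):
    """Euclidean distance between true and predicted box centers, per object."""
    distances = []
    for t, p in zip(true_boxes, pred_boxes):
        (tx, ty), (px, py) = centroid(t), centroid(p)
        distances.append(math.hypot(tx - px, ty - py))
    return distances


def completeness_curve(distances, thresholds):
    """Fraction of objects recovered within each centroid-distance threshold."""
    n = len(distances)
    return [sum(d <= t for d in distances) / n for t in thresholds]


def curve_area(thresholds, completeness):
    """Trapezoidal area under the curve, normalized by the threshold range."""
    area = sum(
        (thresholds[i] - thresholds[i - 1])
        * (completeness[i] + completeness[i - 1]) / 2.0
        for i in range(1, len(thresholds))
    )
    return area / (thresholds[-1] - thresholds[0])


if __name__ == "__main__":
    true_boxes = [(0, 0, 10, 10), (20, 20, 30, 30)]
    pred_boxes = [(0, 0, 10, 10), (23, 24, 33, 34)]
    thresholds = [0.0, 2.5, 5.0, 10.0]
    dist = centroid_distances(true_boxes, pred_boxes)
    comp = completeness_curve(dist, thresholds)
    print(dist, comp, curve_area(thresholds, comp))
```

A normalized area close to 1 means most objects are located within small distances; the same thresholds should be used when comparing different networks.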
Task 3 (Do it yourself!):
Goal: Without Keras or Tensorflow, implement and train a network with a few layers using only Python.
Use the example from: https://machinelearningmastery.com/implement-backpropagation-algorithm-scratch-python/
Challenge: Implement a convolutional layer and create a notebook using your example.
Part 2: Compare your results with an identical network built in Keras/Tensorflow.
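As a starting point for Task 3, the sketch below trains a tiny fully connected network on XOR using only the Python standard library. It is not the code from the linked tutorial: the architecture (one hidden layer of sigmoid units, one sigmoid output), the mean squared error loss, the learning rate, and all function names are assumptions chosen to keep the example short.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def init_network(n_in, n_hidden, seed=1):
    """Random weights; the last entry of each weight row is the bias."""
    rng = random.Random(seed)
    hidden = [[rng.uniform(-1, 1) for _ in range(n_in + 1)] for _ in range(n_hidden)]
    output = [rng.uniform(-1, 1) for _ in range(n_hidden + 1)]
    return hidden, output


def forward(network, x):
    """Return the hidden activations and the scalar output for input x."""
    hidden, output = network
    h = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + row[-1]) for row in hidden]
    y = sigmoid(sum(w * hi for w, hi in zip(output, h)) + output[-1])
    return h, y


def train_step(network, data, lr):
    """One full-batch gradient-descent step on the mean squared error.

    Returns the loss measured before the weights are updated.
    """
    hidden, output = network
    grad_hidden = [[0.0] * len(row) for row in hidden]
    grad_output = [0.0] * len(output)
    loss = 0.0
    for x, target in data:
        h, y = forward(network, x)
        loss += (y - target) ** 2
        # Backpropagation: derivative of the loss w.r.t. the output pre-activation.
        delta_out = 2.0 * (y - target) * y * (1.0 - y)
        for j, hj in enumerate(h):
            grad_output[j] += delta_out * hj
            delta_h = delta_out * output[j] * hj * (1.0 - hj)
            for i, xi in enumerate(x):
                grad_hidden[j][i] += delta_h * xi
            grad_hidden[j][-1] += delta_h
        grad_output[-1] += delta_out
    n = len(data)
    for j, row in enumerate(hidden):
        for i in range(len(row)):
            row[i] -= lr * grad_hidden[j][i] / n
    for j in range(len(output)):
        output[j] -= lr * grad_output[j] / n
    return loss / n


if __name__ == "__main__":
    xor = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]
    net = init_network(n_in=2, n_hidden=3)
    for epoch in range(5000):
        loss = train_step(net, xor, lr=0.5)
        if epoch % 1000 == 0:
            print(f"epoch {epoch}: loss = {loss:.4f}")
    for x, target in xor:
        print(x, target, round(forward(net, x)[1], 3))
```

Building the same architecture in Keras/Tensorflow with the same loss and learning rate gives a direct comparison for Part 2; small networks like this one can stall on XOR for some initial weights, so trying a few seeds is worthwhile.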
Task 4:
Apply an architecture used in Task 1 (with proper adaptations), such as ResNet, ResNeXt, or Inception, to the Strong Lensing Challenge dataset...
Task 5 - Time Series:
1 - Follow the tutorial at: aboveintelligent.com - LSTM Tutorial
2 - Follow the tutorial at: Kaggle - LSTM Time Series Tutorial
3 - Implement an LSTM network for the rainfall classification challenge: Kaggle - How much did it rain?
Build your own implementation of the winning solution: GitHub - danzelmo
and compare its performance using the evaluation metric: Kaggle - Evaluation
Task 6 - AE vs PCA:
Demonstrate that PCA is equivalent to an AutoEncoder with linear functions for a specific choice of loss function.
Recommended reading:
Introduction to Convolutional Neural Networks - Wu
A Beginner’s Guide To Understanding Convolutional Neural Networks
The Deep Learning Book
Suggested tools:
Tensorflow, Anaconda Python 2.X or 3.X, Keras, Jupyter
Additional reading:
Planes don’t flap their wings: does AI work like a brain?
Important dates:
Project definition deadline - 29/08
Work summary submission - 26/09
Seminars with preliminary results - 24/10 and 31/10
Updated code on GitHub - 14/12
Nota Técnica (NT) submitted - 14/12
The final course project must be submitted (at least) to the Notas Técnicas journal: http://revistas.cbpf.br/index.php/nt/about/submissions#authorGuidelines
Basic Electricity (Production Engineering)
List 1
Moysés Nussenzveig, vol III, 1st Ed.
Ch. 2: 1, 5, 7, 8, 9
Ch. 3: 2, 6, 9, 13, 16
List 2
Moysés Nussenzveig, vol III, 1st Ed.
Ch. 4: 1, 3, 4, 6, 10
Ch. 5: 2, 4, 9, 10
Test 1 → 09/14
List 3 (for T1)
Moysés Nussenzveig, vol III, 1st Ed.
Ch. 6: 2, 4, 5, 9
Ch. 7: 1, 3, 4, 5
List 4 (for T2)
Moysés Nussenzveig, vol III, 1st Ed.
Ch. 8: 1, 4, 6, 8, 9, 11
List 5
Moysés Nussenzveig, vol III, 1st Ed.
Ch. 9: 4, 5, 7, 8, 10
Extra List - Challenge
(worth 1.5 extra pts if delivered on the P2 day, correct and legible)
Moysés Nussenzveig, vol III, 1st Ed.
Ch. 10: 2, 4, 8
Ch. 12: 3, 4
Attention: Calendar changed!
P2 → 11/30
PR → 12/07
PF → 12/14
Notice: No class on 10/05 (Friday) and 10/26. No class on 11/22. Request your Test 1 grade by email.
Introduction to Scientific Methodology (Production Engineering)
Supplementary material:
Why is philosophy important in science education at universities? - Folha de SP
There is no exact science (and let's agree all are human)
Why I teach Plato to plumbers
Philosophy of science and Science Teaching: an analogy - Alberto Villani
Suggested debate topics:
Galileo and Giordano Bruno
Intelligent design vs. Evolution
Flat Earth vs. Round Earth
Did man go to the Moon?
Do cell phones cause cancer?
Are GMOs harmful?
Do vaccines cause more harm than good?
Rules:
Arguments based on inaccessible technology are not allowed... All claims must be sourced and demonstrable.
Debate dates:
10/04 - Flat Earth / Moon landing
10/11 - GMOs / Intelligent Design
10/18 - Vaccines / Cell phone cancer
Tim 4
List 1
List 2 - Gravitation
Note: In some exercises, you may use online values such as:
https://pt.wikipedia.org/wiki/Terra
https://pt.wikipedia.org/wiki/Lua
Notice: No classes on 10/05 (Friday) and 10/26 (Friday).