Teaching

Clécio R. Bom

Exercises:

1. Create a new column from a FITS catalog and save the new catalog

2. Estimate the limiting magnitude of the provided catalog. Use the relation between mag_err and S/N

3. Create your catalog using SExtractor (astromatic.net/software/sextractor/). The command should be:

sextractor file_name.fits -c config.sex -PARAMETERS_NAME sex.param -CATALOG_NAME output.fits -CATALOG_TYPE FITS_1.0 -GAIN 8116 -PIXEL_SCALE 0.06 -SEEING_FWHM 0.1 -MAG_ZEROPOINT 25.6651 -PHOT_FLUXFRAC 0.682 -DEBLEND_MINCONT 0.0015 -DEBLEND_NTHRESH 32

The image file to use in place of “file_name.fits” is the FF - HST-Image in the f435 band. Then compare your catalog with the catalogs from Molino et al. 2017; a catalog for the given image is provided.

You have to create your own sex.param file. Please refer to the Source Extraction documentation.

For the lecture on Tuesday, Sept-28, please install:

Arianna Cortesi: TOPCAT (http://www.star.bris.ac.uk/~mbt/topcat/) and Aladin (https://aladin.u-strasbg.fr/)

4. Derive your own value of the Hubble constant from several datasets (hubble_fit_data)

5. Demonstrate that an Ordinary Least Squares fit is equivalent to the Maximum Likelihood estimate for a Gaussian variable with a linear model

6. Demonstrate that the Maximum Likelihood Method for a Bernoulli variable with a linear model is equivalent to minimizing the cross-entropy between predictions and measured variables.

7. Demonstrate that the minimum of cross-entropy is obtained when the two probabilities are equal.
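
Before attempting the proof in exercise 7, the claim can be checked numerically. The sketch below assumes a Bernoulli variable with true probability p and predicted probability q, scans q over a grid, and reports the q that gives the lowest cross-entropy H(p, q) = -[p ln q + (1 - p) ln(1 - q)]. The function names and the grid resolution are illustrative choices, not part of the course material.

```python
import math

def cross_entropy(p, q):
    """Cross-entropy H(p, q) between Bernoulli(p) and Bernoulli(q), in nats."""
    return -(p * math.log(q) + (1.0 - p) * math.log(1.0 - q))

def argmin_q(p, steps=999):
    """Scan q over an open grid in (0, 1) and return the q with the lowest H(p, q)."""
    grid = [(i + 1) / (steps + 1) for i in range(steps)]  # 0.001, 0.002, ..., 0.999
    return min(grid, key=lambda q: cross_entropy(p, q))

if __name__ == "__main__":
    for p in (0.1, 0.3, 0.5, 0.9):
        print(f"p = {p:.1f} -> best q = {argmin_q(p):.3f}")
```

The value at the minimum, H(p, p), is the entropy of p. The gap H(p, q) - H(p, p) is the Kullback-Leibler divergence, and showing that it is non-negative is one route to the analytic proof.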

2020.2

Deep Learning

It is recommended to create a repository on a platform such as GitHub or GitLab to publish your solutions. Try building your solutions using Python notebooks. Task solutions can be posted at: https://github.com/CBomDeepLearningClass2020b

The final course project will consist of a seminar of up to 20 minutes (followed by questions) about the work developed, accompanied by a notebook that must be posted in the CBomDeepLearningClass2020b organization.

The following will be considered as grading criteria:

-> Notebook clarity. Didactic code, well documented with rich explanations.

-> Mastery of the topic. What differentiates your model from other possible models? How and why was the architecture defined/considered adequate?

-> Metrics chosen to evaluate the problem. Did the training converge? Was there overfitting? What were the attempts to remove it? How do we know the model works and with what metric?

-> Data treatment.

Notebook submission deadline (hard limit) is 12/11/2020. By then, everyone must have presented.

Proposed projects for 2020.b

1 - Not so few, not so big: what is the data limit to train gravitational arc identification models to convergence? (images)

2 - What is the detection limit for a gravitational lens? (images)

3 - Few-shot learning for classification of gravitational lenses (images).

4 - Model for star/galaxy separation (tabular/images).

5 - Uncertainty in regression for gravitational lenses (inverse modeling in images).

6 - Exoplanet detection https://www.kaggle.com/keplersmachines/kepler-labelled-time-series-data (sequence)

7 - Galaxy morphological classification with Deep Learning https://www.kaggle.com/c/galaxy-zoo-the-galaxy-challenge/data (images)

8 - Transient classification using photometry https://www.kaggle.com/c/PLAsTiCC-2018 (sequence)

9 - Credit card fraud detection https://www.kaggle.com/mlg-ulb/creditcardfraud (tabular)

10 - Pneumonia detection https://www.kaggle.com/paultimothymooney/chest-xray-pneumonia (images)

11 - Cardiac arrest mortality prediction https://www.kaggle.com/andrewmvd/heart-failure-clinical-data/ (tabular)

12 - Wine quality prediction https://www.kaggle.com/uciml/red-wine-quality-cortez-et-al-2009/ (tabular)

13 - Heartbeat classification https://www.kaggle.com/kinguistics/heartbeat-sounds (audio)

14 - Sentiment analysis from audio https://www.kaggle.com/uwrfkaggler/ravdess-emotional-speech-audio (audio)

15 - CT segmentation for COVID-19 detection https://www.kaggle.com/andrewmvd/covid19-ct-scans (images / semantic segmentation)

Tasks

Task 1 - Diabetes diagnosis (Diagnosis dataset)

Task 2 - First dense network with MNIST

Task 3 - First CNN with MNIST

Task 4 - Classification in a mini-challenge:

https://bitbucket.org/kognitalab/images_mini_challange/src/master/

Part 2: Evaluate your results by generating a ROC curve and calculating its area.

Part 3: Test the robustness by shuffling the training and test sets multiple times and computing the mean and standard deviation of the ROC curve.

Part 4: Evaluate the overfitting of your network.

Part 5: Add another Conv+ReLU+MaxPool block and repeat Part 3. Did it improve? Any insights?

Task 5 - First Kaggle challenge: https://www.kaggle.com/c/titanic/overview/evaluation

Task 6 - Do it yourself:

Goal: Without Keras or TensorFlow, implement and train a network from scratch in Python.

Task 7 - Regression on Kaggle: https://www.kaggle.com/c/house-prices-advanced-regression-techniques

1. Build a model with Random Forest

2. Build a model with LightGBM

3. Make an Ensemble

Task 8 - Apply your architecture to the Strong Lensing dataset and compare.

Task 9 - Time series with Deep Learning

1 - Follow the tutorial: aboveintelligent.com https://aboveintelligent.com/time-series-analysis-using-recurrent-neural-networks-lstm-33817fa4c47a

2 - Follow the tutorial: Kaggle https://www.kaggle.com/amirrezaeian/time-series-data-analysis-using-lstm-tutorial

3 - Implement an LSTM for the rainfall classification challenge https://www.kaggle.com/c/how-much-did-it-rain#description
Replicate the winning solution: https://github.com/danzelmo/how-much-did-it-rain
Compare your results using the official metric: https://www.kaggle.com/c/how-much-did-it-rain#evaluation

Task 10 - AE vs PCA

Show that PCA is equivalent to an autoencoder (AE) with linear activation functions under a specific loss function.
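
One common way to set up this task, offered as a sketch rather than the intended solution: assume mean-centered data x_i in R^d, a linear encoder W_e (k x d), a linear decoder W_d (d x k), no biases, and the mean squared reconstruction error as the loss. All symbols below are assumptions introduced for illustration.

```latex
% Linear autoencoder with squared-error loss (data assumed mean-centered)
\mathcal{L}(W_e, W_d) = \frac{1}{N} \sum_{i=1}^{N} \left\lVert x_i - W_d W_e x_i \right\rVert_2^2

% W_d W_e has rank at most k, so by the Eckart--Young theorem the loss is
% minimized when W_d W_e is the orthogonal projector onto the span of the
% top-k eigenvectors U_k of the sample covariance C:
C = \frac{1}{N} \sum_{i=1}^{N} x_i x_i^{\top}, \qquad W_d W_e = U_k U_k^{\top}

% PCA solves the same problem with the extra constraint W_d = W_e^{\top} = U:
\min_{U^{\top} U = I_k} \; \frac{1}{N} \sum_{i=1}^{N} \left\lVert x_i - U U^{\top} x_i \right\rVert_2^2
```

The autoencoder weights are only determined up to an invertible k x k matrix A (replacing W_d by W_d A and W_e by A^{-1} W_e leaves the loss unchanged), so the equivalence is between the subspaces the two methods recover, not between individual weight vectors and principal components.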

2019.2

  • Basic Electricity
  • Introduction to Scientific Methodology

Attention: due to force majeure, Prof. Clécio's classes will resume on 10/04 (Friday).

Basic Electricity

Lists for P1

List 1
Moysés Nussenzveig, Basic Physics, vol III, 1st Ed.
Chap 2: 1, 5, 7, 8, 9
Young and Freedman, Physics III, 14th Ed.
Chap 21: 1, 3, 20, 21, 23, 27, 34, 37, 42, 45, 72, 86, 91

List 2
Moysés Nussenzveig, Basic Physics, vol III, 1st Ed.
Chap 4: 1, 3, 4, 6, 10
Young and Freedman, Physics III, 14th Ed.
Chap 23: 13, 14, 19, 41, 46, 50, 59, 63, 66, 67, 81, 82, 83

Introduction to Scientific Methodology

Introduction to Scientific Methodology from UFSM Methodology (Unit 1 and 2)
Capes Methodology Guide (Unit 1 and 2)

P2:

Seminar topics, groups of up to 4 people - 20 min

Antiscience movements: Flat Earth
Antiscience movements: Anti-Vaccine Movement
Theory of Evolution vs Intelligent Design
From the scientific method and reproducible observations, discuss Heliocentrism.
Genetically modified foods under the lens of science: are they harmful?
Practices that claim scientific support: Homeopathy
Practices that claim scientific support: Quantum Healing
Global Warming

Additional material:

Flat Earth - Documentary
Mindwalk - Film
The Name of the Rose - Film
Why is philosophy important in teaching science in universities? - Folha de SP column
There is no such thing as exact science (and let’s agree they’re all human)
Why I teach Plato to plumbers
Philosophy of science and science education: an analogy - Alberto Villani

2019.1

Deep Learning minicourse: Introduction to Deep Learning in Astronomy from February 11-14 at IAG-USP.

  • Basic Electricity
  • Introduction to Scientific Methodology

Attention: Due to force majeure, Basic Electricity classes will only resume on 03/29 (Friday).

Note: A teaching assistant (monitor), GABRIEL MEDEIROS DA CUNHA, is available from 1:10 PM to 2:50 PM on Tuesdays and from 3:50 PM to 4:30 PM on Thursdays, starting 03/19. Look for him in the block C classrooms.

List 1
Moysés Nussenzveig, Basic Physics, vol III, 1st Ed.
Chap 2: 1, 5, 7, 8, 9
Chap 3: 2, 6, 9, 13, 16
Young and Freedman, Physics III, 14th Ed.
Chap 21: 1, 3, 20, 21, 23, 27, 34, 37, 42, 45, 72, 86, 91
Chap 22: 1, 7, 9, 12, 13, 33, 49, 61, 62

List 2
Moysés Nussenzveig, vol III, 1st Ed.
Chap 4: 1, 3, 4, 6, 10
Chap 5: 2, 4, 9, 10
Young and Freedman, Physics III, 14th Ed.
Chap 23: 13, 14, 19, 41, 46, 50, 59, 63, 66, 67, 81, 82, 83
Chap 24: 4, 12, 40, 45, 64, 66, 67

List 3 (for T1)
Moysés Nussenzveig, vol III, 1st Ed.
Chap 6: 2, 4, 5, 9
Chap 7: 1, 3, 4, 5
Young and Freedman, Physics III, 14th Ed.
Chap 25: 1, 2, 5, 7, 9, 19, 37
Chap 27: 27, 34, 35, 39, 42
Chap 28: 27, 40, 43

List 4 (for T2)
Moysés Nussenzveig, vol III, 1st Ed.
Chap 8: 1, 4, 6, 8, 9, 11
Chap 9: 4, 5, 7, 8, 10

Important dates (Updated on 06/20):
06/26 - T2 Mechanics class
06/27 - T2 Production
06/28 - P3 Production
07/03 - P3 Mechanics
07/04 - Final Exam Production
07/04 - Final Exam Mechanics
P3 and Final Exam cover all content.

Attention: Final deadline for the Scientific Methodology final paper is 06/20. All papers must be presented by 07/04.

Supplementary material:
The Earth is Flat - Documentary
Mindwalk - Film
The Name of the Rose - Film
Why philosophy matters in science education at universities - Folha de SP
There are no exact sciences (and let's agree all are human sciences)
Why I teach Plato to plumbers
Philosophy of science and science teaching: an analogy - Alberto Villani

Seminar topics (groups up to 4 people - 20 min):
Anti-science movements: The Earth is flat
Anti-science movements: Anti-vaccine movement
Evolution Theory vs Intelligent Design
Based on the scientific method and accessible and reproducible observations, discuss Heliocentrism
GMOs under scientific scrutiny: Are they harmful?
Practices claiming scientific backing: Homeopathy
Practices claiming scientific backing: Quantum Healing
Global Warming

2018.2

  • Deep Learning
  • Basic Electricity
  • Scientific Methodology
  • Tim 4
  • Tim 2

Deep Learning

Task 1 (classification):
https://bitbucket.org/kognitalab/images_mini_challange/src/master/
Part 2: Evaluate your results by generating a ROC curve and calculating its area.
Part 3: Assess the robustness of your results by randomly redefining the training and test samples at least 10-20 times and computing the curve's mean and standard deviation.
Part 4: Evaluate the overfitting of your network.
Part 5: Add another Conv+ReLU+MaxPool block and repeat Part 3.
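
Parts 2 and 3 of this task can be prototyped without any machine learning library. The sketch below is a minimal pure-Python version, assuming binary labels (1 for the positive class) and one score per sample, where a higher score means more likely positive. The function names are illustrative, and `summarize` expects one AUC value per reshuffled train/test split, which you would obtain by retraining your own network each time.

```python
import statistics

def roc_curve(labels, scores):
    """Return (fpr, tpr) lists obtained by sweeping a threshold over the scores."""
    pairs = sorted(zip(scores, labels), key=lambda t: -t[0])
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    fpr, tpr = [0.0], [0.0]
    tp = fp = 0
    for i, (score, label) in enumerate(pairs):
        if label == 1:
            tp += 1
        else:
            fp += 1
        # Emit a point only after the last sample sharing this score (handles ties).
        if i + 1 == len(pairs) or pairs[i + 1][0] != score:
            fpr.append(fp / n_neg)
            tpr.append(tp / n_pos)
    return fpr, tpr

def auc(fpr, tpr):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((fpr[i] - fpr[i - 1]) * (tpr[i] + tpr[i - 1]) / 2.0
               for i in range(1, len(fpr)))

def summarize(auc_values):
    """Mean and sample standard deviation of the AUC over repeated splits."""
    return statistics.mean(auc_values), statistics.stdev(auc_values)

if __name__ == "__main__":
    labels = [1, 1, 0, 1, 0, 0]
    scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
    fpr, tpr = roc_curve(labels, scores)
    print("AUC:", auc(fpr, tpr))
```

In practice a library routine (for example in scikit-learn) does the same job. Writing it once by hand makes it clear that the AUC is the fraction of positive/negative pairs that the network ranks in the right order.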

Task 2 (regression):
Where is Waldo?
Goal: Given a set of images, build a network to identify the location of an object by defining 4 parameters: xmin, ymin, xmax, ymax.
The data is available at: /home/Dados/GeoSimula
Evaluate your results using a completeness vs centroid distance plot. Calculate the area of this figure and produce some example images with bounding boxes.
If you really want to find Waldo, go to: findingwally.pythonanywhere.com
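
The evaluation requested in Task 2 can be sketched as follows. This assumes that "completeness" means the fraction of objects whose predicted box centroid lies within a given distance of the true centroid, which is one plausible reading of the task and not a definition taken from the course. Boxes are (xmin, ymin, xmax, ymax) tuples, and the example boxes and thresholds are made up for illustration.

```python
import math

def centroid(box):
    """Center (x, y) of a box given as (xmin, ymin, xmax, ymax)."""
    xmin, ymin, xmax, ymax = box
    return (xmin + xmax) / 2.0, (ymin + ymax) / 2.0

def centroid_distances(true_boxes, pred_boxes):
    """Euclidean distance between true and predicted centroids, box by box."""
    dists = []
    for t, p in zip(true_boxes, pred_boxes):
        (tx, ty), (px, py) = centroid(t), centroid(p)
        dists.append(math.hypot(px - tx, py - ty))
    return dists

def completeness_curve(distances, thresholds):
    """Fraction of objects whose centroid error is at most each threshold."""
    n = len(distances)
    return [sum(d <= t for d in distances) / n for t in thresholds]

def area_under(xs, ys):
    """Trapezoidal area under the completeness curve."""
    return sum((xs[i] - xs[i - 1]) * (ys[i] + ys[i - 1]) / 2.0
               for i in range(1, len(xs)))

if __name__ == "__main__":
    # Hypothetical boxes, for illustration only.
    true_boxes = [(10, 10, 30, 30), (50, 40, 70, 80), (0, 0, 20, 20)]
    pred_boxes = [(12, 10, 32, 30), (56, 48, 76, 88), (0, 0, 20, 20)]
    thresholds = [0, 4, 8, 12]
    dists = centroid_distances(true_boxes, pred_boxes)
    curve = completeness_curve(dists, thresholds)
    print(dists, curve, area_under(thresholds, curve))
```

Dividing the area by the largest threshold gives a number between 0 and 1, which is easier to compare across runs that use different distance ranges.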

Task 3 (Do it yourself!):
Goal: Without Keras or TensorFlow, implement and train a network with a few layers using only Python.
Use the example from: https://machinelearningmastery.com/implement-backpropagation-algorithm-scratch-python/
Challenge: Implement a convolutional layer and create a notebook using your example.
Part 2: Compare your results with an identical network built in Keras/TensorFlow.
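
As a starting point for Task 3, here is a minimal sketch of a network trained with backpropagation using only the Python standard library. It is not the code from the linked tutorial. The choices made here (one hidden layer, sigmoid activations, squared-error loss, per-sample gradient descent, the XOR toy data, the learning rate and number of epochs) are all assumptions for illustration.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def init_network(n_in, n_hidden, seed=0):
    """Random weights for one hidden layer and one output neuron.
    The last weight of each neuron is its bias."""
    rng = random.Random(seed)
    hidden = [[rng.uniform(-1, 1) for _ in range(n_in + 1)] for _ in range(n_hidden)]
    output = [rng.uniform(-1, 1) for _ in range(n_hidden + 1)]
    return hidden, output

def forward(net, x):
    """Return the hidden activations and the network output for input x."""
    hidden, output = net
    h = [sigmoid(sum(w * xi for w, xi in zip(ws[:-1], x)) + ws[-1]) for ws in hidden]
    y = sigmoid(sum(w * hi for w, hi in zip(output[:-1], h)) + output[-1])
    return h, y

def mse(net, data):
    """Mean squared error of the network over (input, target) pairs."""
    return sum((forward(net, x)[1] - t) ** 2 for x, t in data) / len(data)

def train(net, data, lr=0.5, epochs=5000):
    """Per-sample gradient descent with backpropagation (weights updated in place)."""
    hidden, output = net
    for _ in range(epochs):
        for x, t in data:
            h, y = forward(net, x)
            # Output delta: dL/dz for L = (y - t)^2 / 2 with a sigmoid output.
            delta_out = (y - t) * y * (1.0 - y)
            # Hidden deltas use the output weights before they are updated.
            delta_hid = [delta_out * output[j] * h[j] * (1.0 - h[j])
                         for j in range(len(hidden))]
            for j in range(len(hidden)):
                output[j] -= lr * delta_out * h[j]
            output[-1] -= lr * delta_out
            for j, ws in enumerate(hidden):
                for i in range(len(x)):
                    ws[i] -= lr * delta_hid[j] * x[i]
                ws[-1] -= lr * delta_hid[j]
    return net

if __name__ == "__main__":
    xor = [((0.0, 0.0), 0.0), ((0.0, 1.0), 1.0), ((1.0, 0.0), 1.0), ((1.0, 1.0), 0.0)]
    net = init_network(n_in=2, n_hidden=3)
    print("MSE before training:", mse(net, xor))
    train(net, xor)
    print("MSE after training:", mse(net, xor))
```

Whether XOR is learned well depends on the random initialization, so try a few seeds. For Part 2, the same layer sizes and activations can be rebuilt in Keras and the two loss curves compared.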

Task 4:
Apply an architecture used in Task 1 (with proper adaptations), such as ResNet, ResNeXt, or Inception, to the Strong Lensing Challenge dataset...

Task 5 - Time Series:
1 - Follow the tutorial at: aboveintelligent.com - LSTM Tutorial
2 - Follow the tutorial at: Kaggle - LSTM Time Series Tutorial
3 - Implement an LSTM network for the rainfall classification challenge: Kaggle - How much did it rain?
Build your own implementation of the winning solution: GitHub - danzelmo
and compare its performance using the evaluation metric: Kaggle - Evaluation

Task 6 - AE vs PCA:
Demonstrate that PCA is equivalent to an AutoEncoder with linear functions for a specific choice of loss function.

Recommended reading:
Introduction to Convolutional Neural Networks - Wu
A Beginner’s Guide To Understanding Convolutional Neural Networks
The Deep Learning Book

Suggested tools:
TensorFlow, Anaconda Python 2.X or 3.X, Keras, Jupyter

Additional reading:
Planes don’t flap their wings: does AI work like a brain?

Important dates:

Project definition deadline - 08/29
Work summary submission - 09/26
Seminars with preliminary results - 10/24 and 10/31
Updated code on GitHub - 12/14
Nota Técnica (NT) submitted - 12/14

The final course project must be submitted (at least) to the Notas Técnicas journal: http://revistas.cbpf.br/index.php/nt/about/submissions#authorGuidelines

Basic Electricity (Production Engineering)

List 1
Moysés Nussenzveig, vol III, 1st Ed.
Ch. 2: 1, 5, 7, 8, 9
Ch. 3: 2, 6, 9, 13, 16

List 2
Moysés Nussenzveig, vol III, 1st Ed.
Ch. 4: 1, 3, 4, 6, 10
Ch. 5: 2, 4, 9, 10

Test 1 → 09/14

List 3 (for T1)
Moysés Nussenzveig, vol III, 1st Ed.
Ch. 6: 2, 4, 5, 9
Ch. 7: 1, 3, 4, 5

List 4 (for T2)
Moysés Nussenzveig, vol III, 1st Ed.
Ch. 8: 1, 4, 6, 8, 9, 11

List 5
Moysés Nussenzveig, vol III, 1st Ed.
Ch. 9: 4, 5, 7, 8, 10

Extra List - Challenge
(worth 1.5 extra points if handed in on the day of P2, correct and legible)
Moysés Nussenzveig, vol III, 1st Ed.
Ch. 10: 2, 4, 8
Ch. 12: 3, 4

Attention: Calendar changed!

P2 → 11/30
PR → 12/07
PF → 12/14

Notice: No class on 10/05 (Friday) and 10/26. No class on 11/22. Request your Test 1 grade by email.

Introduction to Scientific Methodology (Production Engineering)

Supplementary material:
Why is philosophy important in science education at universities? - Folha de SP
There is no exact science (and let's agree all are human)
Why I teach Plato to plumbers
Philosophy of science and Science Teaching: an analogy - Alberto Villani

Suggested debate topics:
Galileo and Giordano Bruno
Intelligent design vs. Evolution
Flat Earth vs. Round Earth
Did man go to the Moon?
Do cell phones cause cancer?
Are GMOs harmful?
Do vaccines cause more harm than good?

Rules:
Arguments based on inaccessible technology are not allowed... All claims must be sourced and demonstrable.

Debate dates:
10/04 - Flat Earth / Moon landing
10/11 - GMOs / Intelligent Design
10/18 - Vaccines / Cell phone cancer

Tim 2
list01
list02
list03

Tim 4
list01
list02 - Gravitation
Note: In some exercises, you may use online values such as:
https://pt.wikipedia.org/wiki/Terra
https://pt.wikipedia.org/wiki/Lua

Notice: No classes on 10/05 (Friday) and 10/26 (Friday).