Data Science - Course Syllabus

 

Course Outline

 

1. Introduction (Motivation)

 

1.1. What Is Data Science?

- Concepts

- What the profession involves

- Job market and skills

 

1.2. Languages Used in Data Science

- Python, R, Scala, Java

 

1.3. Crash Course in Python for Data Analysis

 

2. Data Science and Big Data

- Introduction

- What Is Big Data

- Distributed Computing Models

 

3. Machine Learning

(Supervised and Unsupervised Learning)

Theoretical Foundations

Linear Regression

Logistic Regression

KNN

Naive Bayes

Clustering

PCA

Decision Trees

SVM

4. Neural Networks

4.1. Basic Components of Artificial Neural Networks (ANNs)
4.2. The Nervous System
4.3. The Neuron
4.4. Synapses
4.5. Neuron vs. Artificial Neural Networks
4.6. Architecture
4.7. Learning
4.8. Perceptron Networks
4.9. ADALINE Networks – Delta Rule
4.10. Multilayer Perceptron (MLP)
4.11. Backpropagation Algorithm
4.12. Backpropagation – Parameter Tuning
4.13. Optimizers – Delta Rule
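
As a rough preview of items 4.9 and 4.13, the delta (Widrow-Hoff) rule for a single ADALINE unit can be sketched as follows. This is only a minimal illustration, not course material: the function names train_adaline and predict, the learning rate, the number of epochs, and the toy AND dataset with targets -1/+1 are all assumptions chosen for the example.

```python
import numpy as np

def train_adaline(X, y, lr=0.05, epochs=200):
    """Fit one linear unit with the delta rule (per-sample updates)."""
    w = np.zeros(X.shape[1])  # weights start at zero (an arbitrary choice)
    b = 0.0                   # bias term
    for _ in range(epochs):
        for xi, target in zip(X, y):
            output = np.dot(w, xi) + b  # ADALINE uses the linear output here
            error = target - output     # delta: desired minus actual output
            w += lr * error * xi        # move weights along the error gradient
            b += lr * error
    return w, b

def predict(X, w, b):
    """Threshold the linear output to obtain class labels -1 or +1."""
    return np.where(X @ w + b >= 0.0, 1, -1)

# Hypothetical toy data: logical AND with bipolar targets.
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
y = np.array([-1.0, -1.0, -1.0, 1.0])

w, b = train_adaline(X, y)
print(predict(X, w, b))
```

Unlike the perceptron rule, the update here uses the raw linear output rather than the thresholded label, so training minimizes squared error and keeps adjusting weights even for correctly classified samples.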

 

5. Deep Learning

5.1. Introduction and Basic Concepts

5.2. Neural Network Architecture

5.3. The Main Types of Neural Networks

5.4. Deep Learning Frameworks

5.5. Parallel Programming on GPUs

5.6. TensorFlow

5.7. Keras

6. Social Media Analysis

Twitter, Facebook, YouTube, Instagram, Spotify, etc.

 

7. Hadoop and Data Engineering

 

8. Spark

 

9. NoSQL Databases

 

10. Introduction to MongoDB

11. Natural Language Processing (NLP)

NLP and Text Analysis

- Datasets and .py Files

___________________________________________________________________________________

Files on GitHub

 

 

 

 

 

___________________________________________________________________________________

TIPS:

Researchers - Links

Geoffrey E. Hinton
Department of Computer Science
University of Toronto

Geoffrey Hinton received his PhD in Artificial Intelligence from
Edinburgh in 1978. After five years as a faculty member at
Carnegie-Mellon he became a fellow of the Canadian Institute for
Advanced Research and moved to the Department of Computer Science at
the University of Toronto where he is now an Emeritus Distinguished
Professor. He is also a Vice President & Engineering Fellow at Google
and Chief Scientific Adviser of the Vector Institute.

Geoffrey Hinton was one of the researchers who introduced the
backpropagation algorithm and the first to use backpropagation for
learning word embeddings. His other contributions to neural network
research include Boltzmann machines, distributed representations,
time-delay neural nets, mixtures of experts, variational learning and
deep learning. His research group in Toronto made major breakthroughs
in deep learning that revolutionized speech recognition and object
classification.

Geoffrey Hinton is a fellow of the UK Royal Society, a foreign member
of the US National Academy of Engineering and a foreign member of the
American Academy of Arts and Sciences. His awards include the David
E. Rumelhart prize, the IJCAI award for research excellence, the
Killam prize for Engineering, the IEEE Frank Rosenblatt medal, the
IEEE James Clerk Maxwell Gold medal, the NEC C&C award, the BBVA
award, and the NSERC Herzberg Gold Medal which is Canada's top award
in Science and Engineering.

https://www.cs.toronto.edu/~hinton/


===

Sebastian Raschka
Sebastian Raschka is a machine learning researcher developing new deep learning architectures to solve problems in the field of biometrics with a focus on face recognition and privacy protection. Among others, his research activities include applications of machine learning to solve problems in (computational) biology. After receiving his doctorate from Michigan State University, Sebastian recently joined the University of Wisconsin-Madison as Assistant Professor of Statistics.
https://sebastianraschka.com/about.html

===

Ian Goodfellow
Ian J. Goodfellow (born 1985 or 1986) is a researcher working in machine learning, currently employed at Apple Inc. as its director of machine learning in the Special Projects Group.
He was previously employed as a research scientist at Google Brain. He has made several contributions to the field of deep learning.

http://www.iangoodfellow.com/

===

Andrew Ng
Andrew Ng is VP & Chief Scientist of Baidu; Co-Chairman and Co-Founder of Coursera; and an Adjunct Professor at Stanford University. 
https://www.andrewng.org/

===
Deep Learning Book - Free
https://www.deeplearningbook.org/

===

The book Mathematics for Machine Learning, available free as a PDF:
https://mml-book.github.io/

Table of Contents

Part I: Mathematical Foundations
Introduction and Motivation
Linear Algebra
Analytic Geometry
Matrix Decompositions
Vector Calculus
Probability and Distributions
Continuous Optimization

Part II: Central Machine Learning Problems
When Models Meet Data
Linear Regression
Dimensionality Reduction with Principal Component Analysis
Density Estimation with Gaussian Mixture Models
Classification with Support Vector Machines

___________________________________________________________________________________

English Course (free)
http://isf.mec.gov.br/
Free English Course - MyEnglishOnLine

English Course
https://www.italki.com/

________________________________________________________________________________________________________________

FREE COURSES:

Below are free courses on Data Science, Big Data, and Artificial Intelligence. They are well worth taking.

Free Courses (with certificates):

Data Science Academy (DSA) Courses

A. Introduction to Data Science

https://www.datascienceacademy.com.br/public-course?courseid=introduo--cincia-de-dados

B. Big Data Fundamentals

https://www.datascienceacademy.com.br/public-course?courseid=big-data-fundamentos

C. Artificial Intelligence - Fundamentals

https://www.datascienceacademy.com.br/path-player?courseid=inteligencia-artificial-fundamentos&unit=5b4ed2885e4cdee2138b456eUnit

D. Python Fundamentals for Data Analysis

https://www.datascienceacademy.com.br/public-course?courseid=python-fundamentos

E. Power BI for Data Science

https://www.datascienceacademy.com.br/path-player?courseid=microsoft-power-bi-para-data-science

________________________________________________________________________________________________________________

Linear Algebra Course for Data Scientists

Why Should You Learn Linear Algebra to Work with Machine Learning?

Introduction to Statistics

Descriptive Statistics

Calculus - Derivatives

Gradient Descent

 

Data Scientist Links:

Geoffrey Hinton, Toronto, Canada

Ian Goodfellow, Google Brain, USA

Sebastian Raschka, University of Wisconsin-Madison, USA

Andrew Ng, Co-founder of Coursera

Kira Radinsky, eBay

Hilary Mason, NY, USA (bit.ly)

 

 

Neural Network and Deep Learning Links:

Perceptron Animation

MLP Animation

TensorFlow (Deep Learning)

 

Big Data Datasets:

https://archive.ics.uci.edu (a repository rich in datasets, maintained by the University of California, Irvine)
http://mariofilho.com/onde-encontrar-datasets-para-praticar-data-science-e-machine-learning/
https://mineracaodedados.wordpress.com/tag/datasets/
https://www.quandl.com/
http://www.datasciencecentral.com/profiles/blogs/big-data-sets-available-for-free
http://www.datasciencecentral.com/profiles/blogs/20-free-big-data-sources-everyone-should-check-out
http://www.datasciencecentral.com/profiles/blogs/10-great-healthcare-data-sets
http://www.bigdatanews.com/profiles/blogs/another-large-data-set-250-million-data-points-available-for-down