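diff --git a/report_01_交通事故理赔审核预测/.ipynb_checkpoints/report_template-checkpoint.ipynb b/report_01_交通事故理赔审核预测/.ipynb_checkpoints/report_template-checkpoint.ipynb new file mode 100644 index 0000000..21eb1ce --- /dev/null +++ b/report_01_交通事故理赔审核预测/.ipynb_checkpoints/report_template-checkpoint.ipynb @@ -0,0 +1,57 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Report - Report Title\n", + "\n", + "* Name\n", + "* Student ID\n", + "\n", + "\n", + "## Task Overview\n", + "\n", + "Briefly describe what the task is, the format of the data and what it contains, and what the final goal is.\n", + "\n", + "## Approach\n", + "\n", + "This section should mainly cover:\n", + "1. Your analysis of the problem and the overall approach\n", + "2. The methods chosen, and why they were chosen\n", + "3. Problems encountered during implementation, and how they were solved\n", + "4. The final results and experimental analysis\n", + "\n", + "Requirements:\n", + "1. Visualization of the data\n", + "2. The program, with an explanation of each part\n", + "3. Visualization of the results and analysis of accuracy and other metrics\n", + "\n", + "## Summary\n", + "Summarize what you learned while completing the task." + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.5.2" + }, + "main_language": "python" + }, + "nbformat": 4, + "nbformat_minor": 2 +} diff --git a/report_01_交通事故理赔审核预测/README.md b/report_01_交通事故理赔审核预测/README.md new file mode 100644 index 0000000..4253063 --- /dev/null +++ b/report_01_交通事故理赔审核预测/README.md @@ -0,0 +1,23 @@ +# Exercise - Traffic Accident Claim Approval Prediction + +## Content: +* Task type: binary classification + +* Background: After a traffic collision (accident), a claims adjuster visits the scene to inspect it and collect information, which often determines whether the vehicle owner receives a payout from the insurance company. The training set contains 36 pieces of information collected on site by the adjuster for each party to an accident (the information has been encoded), along with whether that party ultimately received a payout. Our task is to predict, from these 36 pieces of information, the probability that the party did not receive a payout. + +* Data: The training set contains 200000 samples and the prediction set contains 80000 samples. +![data_description](images/data_description.png) + +* Evaluation metric: Precision-Recall AUC + + +## Requirements: +1. Write a program that produces an initial classification +2. Register on the competition website and submit your results +3. Analyze the results, consider a range of methods, improve your approach, and submit the results +4. 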
Write your own report following `report_template.ipynb` + + +## References +* [Link to this dataset](http://sofasofa.io/competition.php?id=2) + diff --git a/report_01_交通事故理赔审核预测/Exercise - 交通事故理赔审核预测.ipynb b/report_01_交通事故理赔审核预测/data_tutorial.ipynb similarity index 100% rename from report_01_交通事故理赔审核预测/Exercise - 交通事故理赔审核预测.ipynb rename to report_01_交通事故理赔审核预测/data_tutorial.ipynb diff --git a/report_01_交通事故理赔审核预测/Exercise - 交通事故理赔审核预测.py b/report_01_交通事故理赔审核预测/data_tutorial.py similarity index 100% rename from report_01_交通事故理赔审核预测/Exercise - 交通事故理赔审核预测.py rename to report_01_交通事故理赔审核预测/data_tutorial.py diff --git a/report_01_交通事故理赔审核预测/report_template.ipynb b/report_01_交通事故理赔审核预测/report_template.ipynb new file mode 100644 index 0000000..21eb1ce --- /dev/null +++ b/report_01_交通事故理赔审核预测/report_template.ipynb @@ -0,0 +1,57 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Report - Report Title\n", + "\n", + "* Name\n", + "* Student ID\n", + "\n", + "\n", + "## Task Overview\n", + "\n", + "Briefly describe what the task is, the format of the data and what it contains, and what the final goal is.\n", + "\n", + "## Approach\n", + "\n", + "This section should mainly cover:\n", + "1. Your analysis of the problem and the overall approach\n", + "2. The methods chosen, and why they were chosen\n", + "3. Problems encountered during implementation, and how they were solved\n", + "4. The final results and experimental analysis\n", + "\n", + "Requirements:\n", + "1. Visualization of the data\n", + "2. The program, with an explanation of each part\n", + "3. 
Visualization of the results and analysis of accuracy and other metrics\n", + "\n", + "## Summary\n", + "Summarize what you learned while completing the task." + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.5.2" + }, + "main_language": "python" + }, + "nbformat": 4, + "nbformat_minor": 2 +} diff --git a/report_01_交通事故理赔审核预测/report_template.py b/report_01_交通事故理赔审核预测/report_template.py new file mode 100644 index 0000000..8bb2385 --- /dev/null +++ b/report_01_交通事故理赔审核预测/report_template.py @@ -0,0 +1,45 @@ +# -*- coding: utf-8 -*- +# --- +# jupyter: +# jupytext_format_version: '1.2' +# kernelspec: +# display_name: Python 3 +# language: python +# name: python3 +# language_info: +# codemirror_mode: +# name: ipython +# version: 3 +# file_extension: .py +# mimetype: text/x-python +# name: python +# nbconvert_exporter: python +# pygments_lexer: ipython3 +# version: 3.5.2 +# --- + +# # Report - Report Title +# +# * Name +# * Student ID +# +# +# ## Task Overview +# +# Briefly describe what the task is, the format of the data and what it contains, and what the final goal is. +# +# ## Approach +# +# This section should mainly cover: +# 1. Your analysis of the problem and the overall approach +# 2. The methods chosen, and why they were chosen +# 3. Problems encountered during implementation, and how they were solved +# 4. The final results and experimental analysis +# +# Requirements: +# 1. Visualization of the data +# 2. The program, with an explanation of each part +# 3. 
Visualization of the results and analysis of accuracy and other metrics +# +# ## Summary +# Summarize what you learned while completing the task. diff --git a/report_03_Titanic/.ipynb_checkpoints/Titanic-checkpoint.ipynb b/report_02_Titanic/.ipynb_checkpoints/Titanic-checkpoint.ipynb similarity index 100% rename from report_03_Titanic/.ipynb_checkpoints/Titanic-checkpoint.ipynb rename to report_02_Titanic/.ipynb_checkpoints/Titanic-checkpoint.ipynb diff --git a/report_02_Titanic/README.md b/report_02_Titanic/README.md new file mode 100644 index 0000000..a78ef3b --- /dev/null +++ b/report_02_Titanic/README.md @@ -0,0 +1,39 @@ +# Titanic + +## Competition Description +The sinking of the RMS Titanic is one of the most infamous shipwrecks in history. On April 15, 1912, during her maiden voyage, the Titanic sank after colliding with an iceberg, killing 1502 out of 2224 passengers and crew. This sensational tragedy shocked the international community and led to better safety regulations for ships. + +One of the reasons that the shipwreck led to such loss of life was that there were not enough lifeboats for the passengers and crew. Although there was some element of luck involved in surviving the sinking, some groups of people were more likely to survive than others, such as women, children, and the upper-class. + +In this challenge, we ask you to complete the analysis of what sorts of people were likely to survive. In particular, we ask you to apply the tools of machine learning to predict which passengers survived the tragedy. + +## Practice Skills +* Binary classification +* Python & SKLearn + +## Data +The data is zipped into 'data.zip'; please extract it first. There are two groups: + +* training set (train.csv) +* test set (test.csv) + +The training set should be used to build your machine learning models. For the training set, we provide the outcome (also known as the `ground truth`) for each passenger. Your model will be based on `features` like passengers' gender and class. You can also use feature engineering to create new features. 
+ +The test set should be used to see how well your model performs on unseen data. For the test set, we do not provide the ground truth for each passenger. It is your job to predict these outcomes. For each passenger in the test set, use the model you trained to predict whether or not they survived the sinking of the Titanic. + +We also include `gender_submission.csv`, a set of predictions that assume all and only female passengers survive, as an example of what a submission file should look like. + +### Data description +![data description1](images/data_description1.png) +![data description2](images/data_description2.png) + + +### Variable Notes +pclass: A proxy for socio-economic status (SES) +* 1st = Upper +* 2nd = Middle +* 3rd = Lower + + +## Links +* [Titanic: Machine Learning from Disaster](https://www.kaggle.com/c/titanic) diff --git a/report_03_Titanic/data.zip b/report_02_Titanic/data.zip similarity index 100% rename from report_03_Titanic/data.zip rename to report_02_Titanic/data.zip diff --git a/report_03_Titanic/images/data_description1.png b/report_02_Titanic/images/data_description1.png similarity index 100% rename from report_03_Titanic/images/data_description1.png rename to report_02_Titanic/images/data_description1.png diff --git a/report_03_Titanic/images/data_description2.png b/report_02_Titanic/images/data_description2.png similarity index 100% rename from report_03_Titanic/images/data_description2.png rename to report_02_Titanic/images/data_description2.png diff --git a/report_02_Titanic/report_template.ipynb b/report_02_Titanic/report_template.ipynb new file mode 100644 index 0000000..21eb1ce --- /dev/null +++ b/report_02_Titanic/report_template.ipynb @@ -0,0 +1,57 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Report - Report Title\n", + "\n", + "* Name\n", + "* Student ID\n", + "\n", + "\n", + "## Task Overview\n", + "\n", + "Briefly describe what the task is, the format of the data and what it contains, and what the final goal is.\n", + "\n", + "## Approach\n", + "\n", + "This section should mainly cover:\n", + "1. 
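Your analysis of the problem and the overall approach\n", + "2. The methods chosen, and why they were chosen\n", + "3. Problems encountered during implementation, and how they were solved\n", + "4. The final results and experimental analysis\n", + "\n", + "Requirements:\n", + "1. Visualization of the data\n", + "2. The program, with an explanation of each part\n", + "3. Visualization of the results and analysis of accuracy and other metrics\n", + "\n", + "## Summary\n", + "Summarize what you learned while completing the task." + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.5.2" + }, + "main_language": "python" + }, + "nbformat": 4, + "nbformat_minor": 2 +} diff --git a/report_03_Fashion/README.md b/report_03_Fashion/README.md new file mode 100644 index 0000000..cbfd712 --- /dev/null +++ b/report_03_Fashion/README.md @@ -0,0 +1,24 @@ +# Report3 - Clothing Classification + +## Content: +* Task type: multi-class classification + +* Background: After a traffic collision (accident), a claims adjuster visits the scene to inspect it and collect information, which often determines whether the vehicle owner receives a payout from the insurance company. The training set contains 36 pieces of information collected on site by the adjuster for each party to an accident (the information has been encoded), along with whether that party ultimately received a payout. Our task is to predict, from these 36 pieces of information, the probability that the party did not receive a payout. + +* Data: The training set contains 200000 samples and the prediction set contains 80000 samples. +![data_description](images/data_description.png) + +* Evaluation metric: Precision-Recall AUC + + +## Requirements: +1. Build a deep neural network to perform multi-class classification +2. Write a web crawler to collect images of clothes and shoes from sites such as taobao, and classify them with the trained model +3. Evaluate the classification accuracy on the images you collected +4. Analyze the results, consider a range of methods, improve your approach, and submit the results +5. 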
Write your own report following `report_template.ipynb` + + +## References +* [Link to this dataset](http://sofasofa.io/competition.php?id=2) + diff --git a/report_03_Fashion/fashion-mnist.zip b/report_03_Fashion/fashion-mnist.zip new file mode 100644 index 0000000..5a9530d Binary files /dev/null and b/report_03_Fashion/fashion-mnist.zip differ diff --git a/report_03_Fashion/report_template.ipynb b/report_03_Fashion/report_template.ipynb new file mode 100644 index 0000000..21eb1ce --- /dev/null +++ b/report_03_Fashion/report_template.ipynb @@ -0,0 +1,57 @@ +{ + "cells": [ + { + "cell_type": "markdown", + "metadata": {}, + "source": [ + "# Report - Report Title\n", + "\n", + "* Name\n", + "* Student ID\n", + "\n", + "\n", + "## Task Overview\n", + "\n", + "Briefly describe what the task is, the format of the data and what it contains, and what the final goal is.\n", + "\n", + "## Approach\n", + "\n", + "This section should mainly cover:\n", + "1. Your analysis of the problem and the overall approach\n", + "2. The methods chosen, and why they were chosen\n", + "3. Problems encountered during implementation, and how they were solved\n", + "4. The final results and experimental analysis\n", + "\n", + "Requirements:\n", + "1. Visualization of the data\n", + "2. The program, with an explanation of each part\n", + "3. Visualization of the results and analysis of accuracy and other metrics\n", + "\n", + "## Summary\n", + "Summarize what you learned while completing the task." + ] + } + ], + "metadata": { + "kernelspec": { + "display_name": "Python 3", + "language": "python", + "name": "python3" + }, + "language_info": { + "codemirror_mode": { + "name": "ipython", + "version": 3 + }, + "file_extension": ".py", + "mimetype": "text/x-python", + "name": "python", + "nbconvert_exporter": "python", + "pygments_lexer": "ipython3", + "version": "3.5.2" + }, + "main_language": "python" + }, + "nbformat": 4, + "nbformat_minor": 2 +} diff --git a/report_03_Titanic/Titanic.ipynb b/report_03_Titanic/Titanic.ipynb deleted file mode 100644 index fa1ba20..0000000 --- a/report_03_Titanic/Titanic.ipynb +++ /dev/null @@ -1,71 +0,0 @@ -{ - "cells": [ - { - "cell_type": "markdown", - "metadata": {}, - "source": [ - "# Titanic\n", - "\n", - "## Competition Description\n", - "The sinking of the RMS Titanic is one of the most infamous shipwrecks in history. 
On April 15, 1912, during her maiden voyage, the Titanic sank after colliding with an iceberg, killing 1502 out of 2224 passengers and crew. This sensational tragedy shocked the international community and led to better safety regulations for ships.\n", - "\n", - "One of the reasons that the shipwreck led to such loss of life was that there were not enough lifeboats for the passengers and crew. Although there was some element of luck involved in surviving the sinking, some groups of people were more likely to survive than others, such as women, children, and the upper-class.\n", - "\n", - "In this challenge, we ask you to complete the analysis of what sorts of people were likely to survive. In particular, we ask you to apply the tools of machine learning to predict which passengers survived the tragedy.\n", - "\n", - "## Practice Skills\n", - "* Binary classification\n", - "* Python & SKLearn\n", - "\n", - "## Data\n", - "The data has been split into two groups:\n", - "\n", - "* training set (train.csv)\n", - "* test set (test.csv)\n", - "\n", - "The training set should be used to build your machine learning models. For the training set, we provide the outcome (also known as the `ground truth`) for each passenger. Your model will be based on `features` like passengers' gender and class. You can also use feature engineering to create new features.\n", - "\n", - "The test set should be used to see how well your model performs on unseen data. For the test set, we do not provide the ground truth for each passenger. It is your job to predict these outcomes. 
For each passenger in the test set, use the model you trained to predict whether or not they survived the sinking of the Titanic.\n", - "\n", - "We also include `gender_submission.csv`, a set of predictions that assume all and only female passengers survive, as an example of what a submission file should look like.\n", - "\n", - "### Data description\n", - "![data description1](images/data_description1.png)\n", - "![data description2](images/data_description2.png)\n", - "\n", - "\n", - "### Variable Notes\n", - "pclass: A proxy for socio-economic status (SES)\n", - "* 1st = Upper\n", - "* 2nd = Middle\n", - "* 3rd = Lower\n", - "\n", - "\n", - "## Links\n", - "* [Titanic: Machine Learning from Disaster](https://www.kaggle.com/c/titanic)" - ] - } - ], - "metadata": { - "kernelspec": { - "display_name": "Python 3", - "language": "python", - "name": "python3" - }, - "language_info": { - "codemirror_mode": { - "name": "ipython", - "version": 3 - }, - "file_extension": ".py", - "mimetype": "text/x-python", - "name": "python", - "nbconvert_exporter": "python", - "pygments_lexer": "ipython3", - "version": "3.5.2" - }, - "main_language": "python" - }, - "nbformat": 4, - "nbformat_minor": 2 -} diff --git a/report_03_Titanic/Titanic.py b/report_03_Titanic/Titanic.py deleted file mode 100644 index b66aa59..0000000 --- a/report_03_Titanic/Titanic.py +++ /dev/null @@ -1,58 +0,0 @@ -# --- -# jupyter: -# jupytext_format_version: '1.2' -# kernelspec: -# display_name: Python 3 -# language: python -# name: python3 -# language_info: -# codemirror_mode: -# name: ipython -# version: 3 -# file_extension: .py -# mimetype: text/x-python -# name: python -# nbconvert_exporter: python -# pygments_lexer: ipython3 -# version: 3.5.2 -# --- - -# # Titanic -# -# ## Competition Description -# The sinking of the RMS Titanic is one of the most infamous shipwrecks in history. 
On April 15, 1912, during her maiden voyage, the Titanic sank after colliding with an iceberg, killing 1502 out of 2224 passengers and crew. This sensational tragedy shocked the international community and led to better safety regulations for ships. -# -# One of the reasons that the shipwreck led to such loss of life was that there were not enough lifeboats for the passengers and crew. Although there was some element of luck involved in surviving the sinking, some groups of people were more likely to survive than others, such as women, children, and the upper-class. -# -# In this challenge, we ask you to complete the analysis of what sorts of people were likely to survive. In particular, we ask you to apply the tools of machine learning to predict which passengers survived the tragedy. -# -# ## Practice Skills -# * Binary classification -# * Python & SKLearn -# -# ## Data -# The data has been split into two groups: -# -# * training set (train.csv) -# * test set (test.csv) -# -# The training set should be used to build your machine learning models. For the training set, we provide the outcome (also known as the `ground truth`) for each passenger. Your model will be based on `features` like passengers' gender and class. You can also use feature engineering to create new features. -# -# The test set should be used to see how well your model performs on unseen data. For the test set, we do not provide the ground truth for each passenger. It is your job to predict these outcomes. For each passenger in the test set, use the model you trained to predict whether or not they survived the sinking of the Titanic. -# -# We also include `gender_submission.csv`, a set of predictions that assume all and only female passengers survive, as an example of what a submission file should look like. 
-# -# ### Data description -# ![data description1](images/data_description1.png) -# ![data description2](images/data_description2.png) -# -# -# ### Variable Notes -# pclass: A proxy for socio-economic status (SES) -# * 1st = Upper -# * 2nd = Middle -# * 3rd = Lower -# -# -# ## Links -# * [Titanic: Machine Learning from Disaster](https://www.kaggle.com/c/titanic)