pandas 專題:奧運獎牌排行

郭耀仁

來源

CourseraIntroduction to Data Science in Python 課程第二週作業

資料

In [1]:
import pandas as pd

df = pd.read_csv('https://storage.googleapis.com/py_ml_datasets/olympics.csv', index_col=0, skiprows=1)

for col in df.columns:
    if col[:2]=='01':
        df.rename(columns={col:'Gold'+col[4:]}, inplace=True)
    if col[:2]=='02':
        df.rename(columns={col:'Silver'+col[4:]}, inplace=True)
    if col[:2]=='03':
        df.rename(columns={col:'Bronze'+col[4:]}, inplace=True)
    if col[:1]=='№':
        df.rename(columns={col:'#'+col[1:]}, inplace=True)

names_ids = df.index.str.split('\s\(') # split the index by '('

df.index = names_ids.str[0] # the [0] element is the country name (new index) 
df['ID'] = names_ids.str[1].str[:3] # the [1] element is the abbreviation or ID (take first 3 characters from that)

df = df.drop('Totals')
In [2]:
df.head()
Out[2]:
# Summer Gold Silver Bronze Total # Winter Gold.1 Silver.1 Bronze.1 Total.1 # Games Gold.2 Silver.2 Bronze.2 Combined total ID
Afghanistan 13 0 0 2 2 0 0 0 0 0 13 0 0 2 2 AFG
Algeria 12 5 2 8 15 3 0 0 0 0 15 5 2 8 15 ALG
Argentina 23 18 24 28 70 18 0 0 0 0 41 18 24 28 70 ARG
Armenia 5 1 2 9 12 6 0 0 0 0 11 1 2 9 12 ARM
Australasia 2 3 4 5 12 0 0 0 0 0 2 3 4 5 12 ANZ

問題一

哪個國家贏得的夏季奧運金牌數最多?

def answer_one():
    '''
    這個函數應該回傳一個文字,國家名
    '''
    return "答案"

問題二

哪個國家夏季奧運與冬季奧運的金牌數差距最大?

def answer_two():
    '''
    這個函數應該回傳一個文字,國家名
    '''
    return "答案"

問題三

哪個國家夏季奧運與冬季奧運的金牌數差距除以總金牌數的比例最大?(僅考慮至少有一個夏季金牌與一個冬季金牌的國家)

$$\frac{\text{Summer Gold} - \text{Winter Gold}}{\text{Total Gold}}$$
def answer_three():
    '''
    這個函數應該回傳一個文字,國家名
    '''
    return "答案"

問題四

計算 146 個國家的獎牌點數,其中金牌 3 點、銀牌 2 點、銅牌 1 點。

def answer_four():
    '''
    這個函數應該回傳一個 Series,長度為 146
    '''
    return "答案"

參考解答

In [3]:
def answer_one():
    max_idx = df['Gold'].idxmax()
    return max_idx

answer_one()
Out[3]:
'United States'
In [4]:
def answer_two():
    diff = df['Gold'] - df['Gold.1']
    diff_abs = diff.abs()
    max_idx = diff_abs.idxmax()
    return max_idx

answer_two()
Out[4]:
'United States'
In [5]:
def answer_three():
    at_least_one_gold = (df['Gold'] >= 1) & (df['Gold.1'] >= 1)
    df_filtered = df[at_least_one_gold]
    diff = df_filtered['Gold'] - df_filtered['Gold.1']
    diff_abs = diff.abs()
    ratio = diff_abs / df_filtered['Gold.2']
    max_idx = ratio.idxmax()
    return max_idx

answer_three()
Out[5]:
'Bulgaria'
In [6]:
def answer_four():
    points = df['Gold.2'] * 3 + df['Silver.2'] * 2 + df['Bronze.2']
    return points

answer_four()
Out[6]:
Afghanistan                            2
Algeria                               27
Argentina                            130
Armenia                               16
Australasia                           22
Australia                            923
Austria                              569
Azerbaijan                            43
Bahamas                               24
Bahrain                                1
Barbados                               1
Belarus                              154
Belgium                              276
Bermuda                                1
Bohemia                                5
Botswana                               2
Brazil                               184
British West Indies                    2
Bulgaria                             411
Burundi                                3
Cameroon                              12
Canada                               846
Chile                                 24
China                               1120
Colombia                              29
Costa Rica                             7
Ivory Coast                            2
Croatia                               67
Cuba                                 420
Cyprus                                 2
                                    ... 
Spain                                268
Sri Lanka                              4
Sudan                                  2
Suriname                               4
Sweden                              1217
Switzerland                          630
Syria                                  6
Chinese Taipei                        32
Tajikistan                             4
Tanzania                               4
Thailand                              44
Togo                                   1
Tonga                                  2
Trinidad and Tobago                   27
Tunisia                               19
Turkey                               191
Uganda                                14
Ukraine                              220
United Arab Emirates                   3
United States                       5684
Uruguay                               16
Uzbekistan                            38
Venezuela                             18
Vietnam                                4
Virgin Islands                         2
Yugoslavia                           171
Independent Olympic Participants       4
Zambia                                 3
Zimbabwe                              18
Mixed team                            38
Length: 146, dtype: int64