<aside> 🤔

ํŽ˜๋ฅด์†Œ๋‚˜๋ฅผ ์œ„ํ•œ ๋ฐ์ดํ„ฐ ๋ถ„์„ ๊ณผ์ •

</aside>

Add-to-cart for new-mid_50k

# Condition: keep rows where 'price_band' is 'mid_50k_150k' and 'user_type' is 'new'
mid_price_new_df = merged_df[
    (merged_df['price_band'] == 'mid_50k_150k') &
    (merged_df['user_type'] == 'new')
]

# Count add_to_cart (Yes/No) per traffic_source
grouped_cart = (
    mid_price_new_df
    .groupby(['traffic_source', 'add_to_cart'])
    .size()
    .unstack(fill_value=0)
    .reindex(columns=['Yes', 'No'])  # labels are case-sensitive and must match exactly
    .reset_index()
)
print(grouped_cart)

add_to_cart traffic_source  Yes   No
0                       ad  12%  87%
1                  organic  17%  83%
2                   search  23%  77%
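The snippet above prints raw counts, while the table is shown in percentages. A minimal sketch of one way to get row percentages directly, assuming `add_to_cart` holds the strings 'Yes' and 'No'. The small `demo_df` is made-up data for illustration only and does not come from `merged_version.csv`.

```python
import pandas as pd

# Made-up rows for illustration only (not the real dataset)
demo_df = pd.DataFrame({
    "traffic_source": ["ad", "ad", "ad", "ad", "search", "search"],
    "add_to_cart":    ["Yes", "No", "No", "No", "Yes", "No"],
})

# normalize="index" divides each row by its row total,
# so every traffic_source row sums to 100 after scaling
cart_share = (
    pd.crosstab(demo_df["traffic_source"], demo_df["add_to_cart"], normalize="index")
    .reindex(columns=["Yes", "No"], fill_value=0)
    .mul(100)
    .round(1)
)
print(cart_share)
```

On the real data, the same call would be applied to `mid_price_new_df` instead of `demo_df`.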

Discount exposure among users with add_to_cart = No (new)

import pandas as pd

# Load the CSV file
merged_df = pd.read_csv("merged_version.csv")

# 1. Filter on 'mid_50k_150k' + 'new'
mid_price_new_df = merged_df[
    (merged_df['price_band'] == 'mid_50k_150k') &
    (merged_df['user_type'] == 'new')
]

# 2. Keep only users with add_to_cart == 'No'
no_cart_df = mid_price_new_df[mid_price_new_df['add_to_cart'] == 'No']

# 3. Count discount_exposed values (True/False) per traffic_source
#    (groupby drops rows whose key is NaN by default)
discount_counts = (
    no_cart_df
    .groupby(['traffic_source', 'discount_exposed'])
    .size()
    .unstack(fill_value=0)
)

# 4. Convert counts to row percentages (%)
discount_ratios = discount_counts.div(discount_counts.sum(axis=1), axis=0) * 100
discount_ratios = discount_ratios.round(2).reset_index()

# Print the result
print(discount_ratios)

discount_exposed traffic_source  True  False
0                            ad   55%    44%
1                       organic   60%    40%
2                        search   42%    57%
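By default `groupby` drops rows whose key is NaN, so users with a missing `discount_exposed` value would not show up in the counts. A hedged sketch of one way to keep them visible is to label the values as strings first. The label "Missing" and the `demo_df` rows are assumptions made for illustration, not part of the original analysis.

```python
import numpy as np
import pandas as pd

# Made-up rows for illustration only (not the real dataset)
demo_df = pd.DataFrame({
    "traffic_source":   ["ad", "ad", "ad", "search", "search"],
    "discount_exposed": [True, False, np.nan, True, np.nan],
})

# Turn True/False/NaN into explicit string labels so NaN becomes its own category
exposure_label = demo_df["discount_exposed"].apply(
    lambda v: "Missing" if pd.isna(v) else str(bool(v))
)

# Count each label per traffic_source; absent combinations are filled with 0
counts_with_missing = pd.crosstab(demo_df["traffic_source"], exposure_label)
print(counts_with_missing)
```

If the "Missing" column is non-trivial on the real data, the True/False percentages are shares of the non-missing rows only.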

Let's look into the users who don't add to cart. Why don't they?

ad: 87% did not add to cart. About 55% of those users were exposed to discount information.

organic: 83% did not add to cart. About 60% of those users were exposed to discount information.

search: 77% did not add to cart. About 42% of those users were exposed to discount information.
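The two figures quoted per source above come from separate tables. A minimal sketch of putting them side by side in one summary, assuming `discount_exposed` is boolean with no missing values. The column names `no_cart_pct` and `discount_exposed_pct_among_no_cart` and the `demo_df` rows are hypothetical, chosen for illustration.

```python
import pandas as pd

# Made-up rows for illustration only (not the real dataset)
demo_df = pd.DataFrame({
    "traffic_source":   ["ad", "ad", "ad", "ad", "search", "search"],
    "add_to_cart":      ["No", "No", "No", "Yes", "No", "Yes"],
    "discount_exposed": [True, True, False, False, False, True],
})

# Boolean mask: True for users who did not add to cart
no_cart = demo_df["add_to_cart"].eq("No")

summary = pd.DataFrame({
    # share of all users in each source who did not add to cart
    "no_cart_pct": no_cart.groupby(demo_df["traffic_source"]).mean() * 100,
    # among those no-cart users only, share exposed to a discount
    "discount_exposed_pct_among_no_cart": (
        demo_df.loc[no_cart]
        .groupby("traffic_source")["discount_exposed"]
        .mean() * 100
    ),
}).round(1)
print(summary)
```

Taking the mean of a boolean column gives the share of True values, which is why no explicit counting step is needed here.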

Review clicks among users with add_to_cart = No (new)

# 1. Filter to the 'mid_50k_150k' price band and new users (can be skipped if already defined above)
mid_price_new_df = merged_df[
    (merged_df['price_band'] == 'mid_50k_150k') &
    (merged_df['user_type'] == 'new')
]

# 2. Keep only users who did not add to cart
no_cart_df = mid_price_new_df[mid_price_new_df['add_to_cart'] == 'No']

# 3. Count users per traffic_source and review_clicked
review_click_counts = (
    no_cart_df
    .groupby(['traffic_source', 'review_clicked'])['user_id']
    .count()
    .unstack(fill_value=0)
    .reset_index()
)

print(review_click_counts)

review_clicked traffic_source  False  True
0                          ad    68%   32%
1                     organic    80%   20%
2                      search    79%   21%

Clicking reviews without adding to cart?

ad: 87% did not add to cart. About 32% of those users clicked a review.
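The discount and review steps repeat the same pattern: keep the no-cart users, then break a flag column down by traffic source. A hedged sketch of wrapping that pattern in a single helper is shown below. The function name `share_among_no_cart` and the `demo_df` rows are hypothetical and not part of the original notes.

```python
import pandas as pd

def share_among_no_cart(df: pd.DataFrame, flag_col: str) -> pd.DataFrame:
    """Row percentages of `flag_col` per traffic_source, restricted to add_to_cart == 'No'."""
    no_cart = df[df["add_to_cart"] == "No"]
    return (
        pd.crosstab(no_cart["traffic_source"], no_cart[flag_col], normalize="index")
        .mul(100)
        .round(1)
    )

# Made-up rows for illustration only (not the real dataset)
demo_df = pd.DataFrame({
    "traffic_source": ["ad", "ad", "ad", "ad", "ad", "search", "search"],
    "add_to_cart":    ["No", "No", "No", "No", "Yes", "No", "No"],
    "review_clicked": [True, False, False, False, True, False, True],
})

review_share = share_among_no_cart(demo_df, "review_clicked")
print(review_share)
```

The same helper could be called with `"discount_exposed"` to reproduce the earlier breakdown, assuming both columns exist in the filtered frame.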