I'm struggling with a Pandas problem. I have the following data.
+--------+------+---------+---------+-------------+-------------+--------------+------------+-------------+------------+----------+
| symbol | side | status | origQty | executedQty | qty | availableQty | price | boughtValue | soldValue | dcaLevel |
+--------+------+---------+---------+-------------+-------------+--------------+------------+-------------+------------+----------+
| DGDBTC | BUY | FILLED | 0.125 | 0.12500000 | 0.12500000 | 0.12500000 | 0.02000700 | 0.00250088 | | |
| DGDBTC | BUY | FILLED | 0.125 | 0.12500000 | 0.12500000 | 0.25000000 | 0.01960100 | 0.00245013 | | |
| DGDBTC | SELL | FILLED | 0.25 | 0.25000000 | -0.25000000 | 0.00000000 | 0.02005900 | | 0.00501475 | |
| DGDBTC | BUY | FILLED | 0.113 | 0.11300000 | 0.11300000 | 0.11300000 | 0.02203000 | 0.00248939 | | |
| DGDBTC | BUY | FILLED | 0.113 | 0.11300000 | 0.11300000 | 0.22600000 | 0.02160300 | 0.00244114 | | |
| DGDBTC | BUY | EXPIRED | 0.226 | 0.00000000 | 0.00000000 | | 0.02125500 | | | |
| DGDBTC | BUY | PARTIAL | 0.226 | 0.15800000 | 0.15800000 | 0.38400000 | 0.02126100 | | | |
| DGDBTC | SELL | EXPIRED | 0.384 | 0.00000000 | 0.00000000 | | 0.02196600 | | | |
| DGDBTC | SELL | EXPIRED | 0.384 | 0.00000000 | 0.00000000 | | 0.02214300 | | | |
| DGDBTC | SELL | EXPIRED | 0.384 | 0.00000000 | 0.00000000 | | 0.02189900 | | | |
| DGDBTC | BUY | FILLED | 0.384 | 0.38400000 | 0.38400000 | 0.76800000 | 0.02082900 | 0.00799834 | | |
| DGDBTC | BUY | FILLED | 0.768 | 0.76800000 | 0.76800000 | 1.53600000 | 0.01984300 | 0.01523942 | | |
| DGDBTC | SELL | EXPIRED | 1.536 | 0.00000000 | 0.00000000 | | 0.02074400 | | | |
| DGDBTC | SELL | EXPIRED | 1.536 | 0.00000000 | 0.00000000 | | 0.02094100 | | | |
| DGDBTC | SELL | EXPIRED | 1.536 | 0.00000000 | 0.00000000 | | 0.02076800 | | | |
| DGDBTC | SELL | PARTIAL | 1.536 | 0.30300000 | -0.30300000 | 1.23300000 | 0.02065000 | | | |
| DGDBTC | SELL | FILLED | 1.233 | 1.23300000 | -1.23300000 | 0.00000000 | 0.02070000 | | 0.02552310 | |
+--------+------+---------+---------+-------------+-------------+--------------+------------+-------------+------------+----------+
This a subset of the data group by symbol. For each symbol I want to populate the last column with a value that complies with the following rules:
- Each buy order (side=BUY) in a series has the value zero (0).
- For each consecutive buy order the value is increased by one (1).
- When a sell order (side=SELL) is reached it marks a new buy order serie.
- Rows with status EXPIRED are skipped.
Example:
+--------+------+---------+---------+-------------+-------------+--------------+------------+-------------+------------+----------+
| symbol | side | status | origQty | executedQty | qty | availableQty | price | boughtValue | soldValue | dcaLevel |
+--------+------+---------+---------+-------------+-------------+--------------+------------+-------------+------------+----------+
| DGDBTC | BUY | FILLED | 0.125 | 0.12500000 | 0.12500000 | 0.12500000 | 0.02000700 | 0.00250088 | | 0 |
| DGDBTC | BUY | FILLED | 0.125 | 0.12500000 | 0.12500000 | 0.25000000 | 0.01960100 | 0.00245013 | | 1 |
| DGDBTC | SELL | FILLED | 0.25 | 0.25000000 | -0.25000000 | 0.00000000 | 0.02005900 | | 0.00501475 | |
| DGDBTC | BUY | FILLED | 0.113 | 0.11300000 | 0.11300000 | 0.11300000 | 0.02203000 | 0.00248939 | | 0 |
| DGDBTC | BUY | FILLED | 0.113 | 0.11300000 | 0.11300000 | 0.22600000 | 0.02160300 | 0.00244114 | | 1 |
| DGDBTC | BUY | EXPIRED | 0.226 | 0.00000000 | 0.00000000 | | 0.02125500 | | | |
| DGDBTC | BUY | PARTIAL | 0.226 | 0.15800000 | 0.15800000 | 0.38400000 | 0.02126100 | | | 2 |
| DGDBTC | SELL | EXPIRED | 0.384 | 0.00000000 | 0.00000000 | | 0.02196600 | | | |
| DGDBTC | SELL | EXPIRED | 0.384 | 0.00000000 | 0.00000000 | | 0.02214300 | | | |
| DGDBTC | SELL | EXPIRED | 0.384 | 0.00000000 | 0.00000000 | | 0.02189900 | | | |
| DGDBTC | BUY | FILLED | 0.384 | 0.38400000 | 0.38400000 | 0.76800000 | 0.02082900 | 0.00799834 | | 3 |
| DGDBTC | BUY | FILLED | 0.768 | 0.76800000 | 0.76800000 | 1.53600000 | 0.01984300 | 0.01523942 | | 4 |
| DGDBTC | SELL | EXPIRED | 1.536 | 0.00000000 | 0.00000000 | | 0.02074400 | | | |
| DGDBTC | SELL | EXPIRED | 1.536 | 0.00000000 | 0.00000000 | | 0.02094100 | | | |
| DGDBTC | SELL | EXPIRED | 1.536 | 0.00000000 | 0.00000000 | | 0.02076800 | | | |
| DGDBTC | SELL | PARTIAL | 1.536 | 0.30300000 | -0.30300000 | 1.23300000 | 0.02065000 | | | |
| DGDBTC | SELL | FILLED | 1.233 | 1.23300000 | -1.23300000 | 0.00000000 | 0.02070000 | | 0.02552310 | |
+--------+------+---------+---------+-------------+-------------+--------------+------------+-------------+------------+----------+
I have tried the following two ways.
merged_df['dcaLevel'] = merged_df[(merged_df['side'] == 'BUY') & (merged_df['status'].isin(['FILLED', 'PARTIAL']))].groupby(['symbol'])['dcaLevel'].apply(lambda x: x.shift(1) + 1)
This way throws an error.
merged_df['dcaLevel'] = merged_df[(merged_df['side'] == 'BUY') & (merged_df['status'].isin(['FILLED', 'PARTIAL']))].groupby(['symbol'])['dcaLevel'].apply(lambda x: 0 if x.shift(1) else x.shift(1) + 1)
I tried the following alternative approach.
symbol_df = merged_df.loc[merged_df['symbol'] == 'DGDBTC']
tmp_df = symbol_df[(symbol_df['side'] == 'BUY') & (symbol_df['status'].isin(['FILLED', 'PARTIAL']))]
tmp_df['dcaLevel'] = np.where(tmp_df['availableQty'] < tmp_df['availableQty'].shift(1), 0, tmp_df['dcaLevel'].shift(1) + 1)
It works on some rows and not on others and the first buy order in a serie remains NaN.
I have coded the following which does work, but I'm sure there is an easier way to do this with Pandas.
merged_df['dcaLevel'] = np.NaN
grouped = merged_df[merged_df['status'].isin(['FILLED', 'PARTIAL'])].groupby(['symbol'])
col_idx = merged_df.columns.get_loc('dcaLevel')
for name, group in grouped:
first = True
for index, row in group.iterrows():
if row['side'] == 'SELL':
first = True
dca_level = np.NaN
else:
if first:
first = False
dca_level = 0
else:
dca_level = dca_level + 1
merged_df.iloc[index, col_idx] = dca_level
merged_df[merged_df['symbol'] == 'DGDBTC']
I hope someone can help solve this problem.