Questions tagged [groupby]

A groupby operation splits the data into groups, applies a function to each group, and combines the results.

The Pandas library offers a handy explanation.
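
The split, apply, and combine steps described above can be sketched in plain Python. This is only an illustration under stated assumptions: the sample rows and the helper name `group_by` are hypothetical and do not come from any question on this page, and libraries such as pandas implement the same idea far more efficiently.

```python
from collections import defaultdict

# Hypothetical sample rows: (group key, value).
rows = [("a", 1), ("b", 10), ("a", 2), ("b", 20), ("c", 5)]

def group_by(records, key_fn, value_fn, agg_fn):
    """Split records by key, apply agg_fn to each group's values, combine into a dict."""
    # Split: bucket the values by their group key.
    groups = defaultdict(list)
    for record in records:
        groups[key_fn(record)].append(value_fn(record))
    # Apply and combine: one aggregated result per group.
    return {key: agg_fn(values) for key, values in groups.items()}

totals = group_by(rows, key_fn=lambda r: r[0], value_fn=lambda r: r[1], agg_fn=sum)
print(totals)  # {'a': 3, 'b': 30, 'c': 5}
```

In pandas, a comparable result would come from something like `df.groupby("key")["value"].sum()`, assuming a DataFrame with columns named `key` and `value`.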

31 questions
5
votes
1 answer

How to use df.groupby() to select and sum specific columns without pandas trimming the total number of columns

I have Column1, Column2, Column3, Column4, Column5, and Column6. I'd like to group by Column1 and get the row sum of Column3, Column4, and Column5. When I apply groupby(), the result is correct, but it leaves out Column6: df =…
Steven
3
votes
0 answers

User defined aggregations on data of around 200GB where row order matters

I am working with "medium large" data of around 200GB. The data are long-form log files, with several thousand logs for each "entity". The entities are actually flights, and each log entry occurs at a different timestamp. Temporal order…
Placidia
2
votes
1 answer

Normalize data from different groups

I have data that has been grouped into 27 groups by different criteria. The reason for these groupings is to show that each group has different behavior. However, I would like to normalize everything to the same scale. For example, I would like to…
formicaman
1
vote
1 answer

pandas groupby.count doesn't count zero occurrences

I am using groupby.count on 2 columns to get value occurrences under a class constraint. However, if value $x$ in a feature never occurs with class $y$, this pandas method returns only non-zero frequencies. Is there any solution or alternate…
Şafak
1
vote
0 answers

MongoDB Groupby Rank

I'm working with MongoDB and want to write a query using the aggregate function. The task is: each city has several zip codes. Find the city in each state with the most zip codes and rank those cities along with the states using the city…
Noob
1
vote
0 answers

Why is the behaviour of the groupby command different in R and in Python with pandas?

If I execute the following lines of code in both R and Python for the groupby command, the order of the results does not match when I convert them to dataframes afterwards. How can I keep the order the same when I execute the R code…
Malathi
1
vote
1 answer

pandas groupby and sort values

I am studying for an exam and encountered this problem in past worksheets: This is the data frame called 'contest', whose granularity is each submission of a question by each contestant in the math contest. The question is and the answer is in…
JChang
1
vote
1 answer

Pandas - Sum of multiple specific columns

I created this script: import pandas as pd pd.set_option('display.min_rows', None) pd.set_option('display.max_columns', None) df = pd.read_excel('file.xlsx', sep=';', skiprows=6) df = df.drop(['Position',…
Steven
1
vote
0 answers

Custom DataFrame format for exporting to Excel sheets

I have the following DataFrame and I want it exported to an Excel file with different sheets having different words, as shown in the image: 0 2020-06-14 coronavirus es -0.021277 coronavirus is -0.006024 1 2020-06-13 coronavirus es -0.021277…
m2rik
1
vote
1 answer

How to group by one column and count frequencies from another column for each item in the first column in Python?

I am trying to group my data by the 'ID' column. Then I want to count the frequency of 'sequence' for each 'ID'. Here is a sample of the data frame: ID Sequence 101 1-2 101 3-1 101 1-2 102 4-6 102 7-8 102 4-6 102 4-6 103 …
Farah
1
vote
1 answer

Group_by 2 variables and pivot_wider distribution based on 2 others

I am performing some calculations on a dataframe and am stuck trying to calculate a few percentages. I am trying to append 3 additional columns for %POS/NEG/NEU. E.g., the sum of amount col for all observations w/ POS Direction in both Drew & A/total sum…
DataGuy23
1
vote
0 answers

Pushing down Group By clause

I've studied this exercise in class, but I cannot figure out why, when I push down the Group By clause, I can remove the RLCode attribute from the GroupBy. Does this action change the meaning of the query tree?
Lorenzoi
1
vote
1 answer

Grouping by Multi-Indices of both Row and Column

I have created a table using Pandas following material from here. The table created makes use of Multi-Indices for both columns and rows. I am trying to compute the descriptive statistics for each year and subject, meaning, displaying for instance…
Mar
0
votes
1 answer

Using user defined function in groupby

I am trying to use the groupby functionality in order to do the following given this example dataframe: dates = ['2020-03-01','2020-03-01','2020-03-01','2020-03-01','2020-03-01', …
0
votes
0 answers

Accessing group elements in groupby

I have a question about groupby operations. Let's suppose I have grouped my data based on one column and got 5 groups as output. I know that we can iterate over these groups and apply functions to them as a whole. But can I access the elements of each…