You can get the last row of each run of consecutive days that shares the same groups (X, Y) and the same Value using the following steps:
1. Ensure datetime:
df['Date'] = pd.to_datetime(df['Date'])
2. Ensure data is sorted by groups, then date:
tmp = df.sort_values(by=['X', 'Y', 'Date'])
3. Form groups of consecutive data:
group = (
    tmp['Date'].diff().ne(pd.Timedelta(days=1))
    | tmp[['X', 'Y', 'Value']].ne(tmp[['X', 'Y', 'Value']].shift()).any(axis=1)
).cumsum()
4. Form grouper:
g = tmp.groupby(group)  # group the sorted frame so that last() follows date order
5. Get last row (date) per group, keeping only groups with at least N_days rows:
out = g.last()[g.size().ge(N_days)]
6. Print the output:
print(out)
Output:
        Date   X    Y  Value
9 2024-01-20  X2  456      4
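For reference, the steps above can be combined into one function. This is only a minimal sketch: the function name `last_of_consecutive_runs`, the `n_days` parameter, and the sample data are made up for illustration and are not the data from the question. It assumes the columns are named `Date`, `X`, `Y` and `Value` as in the steps above.

```python
import pandas as pd


def last_of_consecutive_runs(df, n_days):
    """Last row of every run of at least n_days consecutive days
    that shares the same X, Y and Value (hypothetical helper)."""
    tmp = df.copy()
    tmp['Date'] = pd.to_datetime(tmp['Date'])
    tmp = tmp.sort_values(by=['X', 'Y', 'Date'])

    keys = tmp[['X', 'Y', 'Value']]
    # A new run starts when the date gap is not exactly one day,
    # or when any of X, Y, Value differs from the previous row.
    new_run = (
        tmp['Date'].diff().ne(pd.Timedelta(days=1))
        | keys.ne(keys.shift()).any(axis=1)
    )
    group = new_run.cumsum()  # running count of run starts = run id

    g = tmp.groupby(group)
    # Keep the last row of each run, then drop runs shorter than n_days.
    return g.last()[g.size().ge(n_days)]


# Made-up sample data, for illustration only.
sample = pd.DataFrame({
    'Date': ['2024-01-01', '2024-01-02', '2024-01-03',
             '2024-01-05', '2024-01-06',
             '2024-01-01', '2024-01-02'],
    'X': ['X1'] * 5 + ['X2'] * 2,
    'Y': [123] * 5 + [456] * 2,
    'Value': [1, 1, 1, 1, 1, 4, 5],
})

result = last_of_consecutive_runs(sample, n_days=3)
print(result)
```

On this sample, only the X1/123 run from 2024-01-01 to 2024-01-03 is at least 3 days long, so the result has a single row dated 2024-01-03. The gap before 2024-01-05 starts a new run of 2 days, and the two X2 rows have different values, so they are filtered out.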