Philip May - Data Science and IT#

I’m Philip May, data scientist expert and open source enthusiast with an NLP focus.
I come from Germany and work for Deutsche Telekom.

This website is a mixture of documentation, blog and personal notes.

About Me
Machine Learning · Python · IT · Linux

Blog#

The importance of chat templates - April 11, 2024
Pandas apply - November 18, 2023
Options for Date Encoding - October 12, 2022
Python Installation and Package Management with conda and pip - July 23, 2022
Anomalies in the MLSUM Dataset - February 23, 2022
Clean German Wikipedia Text Corpus released - February 22, 2022
LightGBM with Optuna: Demo released - February 20, 2022
German colossal, cleaned Common Crawl corpus (GC4) released - April 10, 2021
Training and Evaluation of our German Electra Language Model Talk - December 01, 2020