Data Efficacy for Language Model Training
Updated May 29, 2026 - Python
Curation of BIDS (CuBIDS): A sanity-preserving software package for processing BIDS datasets.
Application to help you track, categorize, and rate the fan-made music videos you watch (with a focus on anime music videos)
Sanitize and organize astrophotography subframes from ASIAir or DSLR captured data. Organize files by folder for WBPP keywords, cleanup empty directories, remove jpg previews, rename RAW files using EXIF data.
A file management system for turning digital chaos into an organized archive. Includes scripts to sort, analyze, and consolidate files.
A React + Vite application for managing hospital appointments. Add patients, view patient lists, and navigate between pages seamlessly.
This application is designed to simplify the process of collecting and managing leads for an organization. It provides an intuitive user interface and several useful features to streamline data entry, organization, and follow-up activities.
Organize experimental data in a structured and coherent manner
copyright-stats-extractor parses headlines/articles on digital copyright enforcement to auto‑extract stats like takedown counts, year, and parties.
Applies data visualization techniques and organizes, cleans, and aggregates data from multiple sources to determine how much happiness has changed over time in all regions of the world
rankextractplus extracts and structures ranked info from text, organizing data for easier comparison and analysis.
Extracts key release details from unstructured text to create clear, structured summaries.
My MTech project focuses on classifying hand gestures using surface Electromyography (sEMG) signals to enable intuitive control for assistive and rehabilitation technologies. I worked with large-scale EMG datasets (80,000×8 signals per trial) and built a complete ML pipeline for accurate classification of the hand postures.
Fields management from/to different data sources. 💡
This is the 'data.aykhan.net' repository, serving as a dedicated static data API. It offers structured endpoints for user profiles, product details, events, and more, simplifying data access for web and software projects. Explore and integrate reliable static data into your applications with ease.
Python automation scripts: web scraping for data analysis
Access computer science history by year, including major breakthroughs, research papers, and technological advancements.
Attempts to correct the arbitrary filenames of photos/videos/audio downloaded from Instagram based on the time they were sent.
Data Organization in Spreadsheets to Ease Further Processing