first commit

2025-12-15 22:10:02 +05:30
commit 21893043ae
4 changed files with 13028 additions and 0 deletions
@@ -0,0 +1,73 @@
+# Image Annotation Project
+
+This repository contains Jupyter notebooks for annotating images with vision-language models such as Moondream. The notebooks cover image understanding, object segmentation, and conversion of the results to COCO format.
+
+## Notebooks
+
+### 1. Image_Annotation_Testing_Satyam.ipynb
+
+This notebook contains experiments that evaluate how well vision-language models understand and annotate images.
+
+### 2. Moondream_Segmentation_Satyam.ipynb
+
+This notebook segments objects in images with the Moondream vision-language model and generates a boundary for each object in the scene.
+
+### 3. Moondream3_to_COCO_Satyam.ipynb
+
+This notebook converts the segmented objects into COCO (Common Objects in Context) annotations, a standardized JSON format suitable for training computer vision models.
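+
+To illustrate the target format, the sketch below builds a minimal COCO-style dictionary from a single polygon. It is not taken from the notebook: the helper `polygon_to_annotation`, the file name `example.jpg`, the image size, and the category name `object` are all hypothetical, and the notebook's actual conversion may differ.
+
+```python
+import json
+
+
+def polygon_to_annotation(points, ann_id, image_id, category_id):
+    """Build one COCO annotation from a list of (x, y) polygon vertices."""
+    xs = [x for x, _ in points]
+    ys = [y for _, y in points]
+    # Shoelace formula for the polygon area.
+    area = 0.0
+    for i in range(len(points)):
+        x1, y1 = points[i]
+        x2, y2 = points[(i + 1) % len(points)]
+        area += x1 * y2 - x2 * y1
+    area = abs(area) / 2.0
+    return {
+        "id": ann_id,
+        "image_id": image_id,
+        "category_id": category_id,
+        # COCO polygons are flat lists: [x1, y1, x2, y2, ...]
+        "segmentation": [[float(c) for pt in points for c in pt]],
+        # COCO boxes are [x_min, y_min, width, height]
+        "bbox": [
+            float(min(xs)),
+            float(min(ys)),
+            float(max(xs) - min(xs)),
+            float(max(ys) - min(ys)),
+        ],
+        "area": area,
+        "iscrowd": 0,
+    }
+
+
+# Hypothetical example data: one square object in one image.
+square = [(10, 20), (50, 20), (50, 60), (10, 60)]
+coco = {
+    "images": [{"id": 1, "file_name": "example.jpg", "width": 640, "height": 480}],
+    "categories": [{"id": 1, "name": "object"}],
+    "annotations": [polygon_to_annotation(square, ann_id=1, image_id=1, category_id=1)],
+}
+print(json.dumps(coco, indent=2))
+```
+
+The three top-level keys (`images`, `categories`, `annotations`) are linked by integer IDs, so one image can carry many annotations without repeating its metadata.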
+
+## Prerequisites
+
+To run these notebooks, you'll need:
+
+- Python 3.8+
+- Jupyter Notebook or JupyterLab
+- PyTorch
+- Transformers
+- Pillow
+- NumPy
+- OpenCV
+- Moondream model dependencies
+
+## Setup
+
+1. Clone or download this repository
+2. Install required dependencies:
+
+```bash
+pip install torch torchvision
+pip install transformers pillow numpy opencv-python jupyter
+```
+
+3. Launch Jupyter:
+
+```bash
+jupyter notebook
+```
+
+4. Open any of the notebooks and run the cells in order
+
+## Usage
+
+Each notebook can be run independently depending on your specific needs:
+
+1. Use `Image_Annotation_Testing_Satyam.ipynb` to test and evaluate image annotation capabilities
+2. Use `Moondream_Segmentation_Satyam.ipynb` for object segmentation tasks
+3. Use `Moondream3_to_COCO_Satyam.ipynb` to convert annotations to COCO format
+
+## Dependencies
+
+- [Moondream](https://github.com/vikhyat/moondream) - Vision-language model
+- PyTorch - Deep learning framework
+- OpenCV - Computer vision library
+- COCO API - For annotation format handling
+
+## Notes
+
+- Ensure you have sufficient GPU memory for running vision-language models
+- Models may require internet connectivity for initial downloads
+- Results may vary depending on the complexity of the images
+
+## Author
+
+Satyam - Image Annotation Project