Added Embeddings base

Efren Yevale Varela 2 years ago
Parent
Current commit
c922a5b7f2
6 files changed, with 318 insertions and 0 deletions
  1. 2 0
      .gitignore
  2. 192 0
      Embeddings/Load.ipynb
  3. 115 0
      Embeddings/Milvus.ipynb
  4. 0 0
      Embeddings/data/.documents
  5. 0 0
      Embeddings/dist/.models
  6. 9 0
      requirements.txt

+ 2 - 0
.gitignore

@@ -1,4 +1,6 @@
 **/.ipynb_checkpoints/
 **/00*.ipynb
 **/*.npy
+Embeddings/data/*
+Embeddings/dist/*
 Milvus/volumes

+ 192 - 0
Embeddings/Load.ipynb

@@ -0,0 +1,192 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "id": "9c5c18a1",
+   "metadata": {},
+   "source": [
+    "* LLM: https://huggingface.co/TheBloke/dolphin-2.2.1-mistral-7B-GGUF\n",
+    "* FlagEmbedding: https://huggingface.co/BAAI/bge-small-en-v1.5"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "2df4fa0b",
+   "metadata": {},
+   "source": [
+    "# Libraries\n",
+    "\n",
+    "The _Python_ bindings for _llama.cpp_ do not include _GPU_ support unless they are built with the required argument; they can be recompiled with the following command:\n",
+    "```sh\n",
+    "CMAKE_ARGS='-DLLAMA_CUBLAS=on' pip install --force-reinstall --no-cache-dir llama-cpp-python\n",
+    "```"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "df27797f",
+   "metadata": {
+    "scrolled": true
+   },
+   "outputs": [],
+   "source": []
+  },
+  {
+   "cell_type": "markdown",
+   "id": "fec62a10-af8a-4193-9f0c-b216f0c45723",
+   "metadata": {},
+   "source": [
+    "# Change Directory for Model Downloads"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "08df0af5-5c2c-43f6-ae31-b5984cf55182",
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "environ[\"LLAMA_INDEX_CACHE_DIR\"] = \"dist\""
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "4876213b",
+   "metadata": {},
+   "source": [
+    "# Large Language Model"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "092108a6",
+   "metadata": {},
+   "outputs": [],
+   "source": []
+  },
+  {
+   "cell_type": "markdown",
+   "id": "6a2a2b8f",
+   "metadata": {},
+   "source": [
+    "# FlagEmbedding"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "baa95bb6",
+   "metadata": {},
+   "outputs": [],
+   "source": []
+  },
+  {
+   "cell_type": "markdown",
+   "id": "2411948b-2a54-46e8-be1c-b3916c32dff8",
+   "metadata": {},
+   "source": [
+    "# Milvus Vector Database\n",
+    "\n",
+    "_Milvus_ must be running first; change to the _Milvus_ directory and start it with:\n",
+    "```sh\n",
+    "docker compose up\n",
+    "```"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "957c6f1c-31b7-40e8-ab18-02069802fe70",
+   "metadata": {},
+   "outputs": [],
+   "source": []
+  },
+  {
+   "cell_type": "markdown",
+   "id": "33f35de9-a15c-48ea-82c9-b4e95d9c8c25",
+   "metadata": {},
+   "source": [
+    "# Load Documents from Directory"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "c59696a9-a8f7-41c3-86d5-af2a80ea7fea",
+   "metadata": {},
+   "outputs": [],
+   "source": []
+  },
+  {
+   "cell_type": "markdown",
+   "id": "c9401905",
+   "metadata": {},
+   "source": [
+    "# Tying Everything Up"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "bca13d96",
+   "metadata": {},
+   "outputs": [],
+   "source": []
+  },
+  {
+   "cell_type": "markdown",
+   "id": "22a99803",
+   "metadata": {},
+   "source": [
+    "# Query Engine"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "097a8526",
+   "metadata": {},
+   "outputs": [],
+   "source": []
+  },
+  {
+   "cell_type": "markdown",
+   "id": "c07e53c6",
+   "metadata": {},
+   "source": [
+    "## Get a Response"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "7f4d8911",
+   "metadata": {},
+   "outputs": [],
+   "source": []
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3 (ipykernel)",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.11.6"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}
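
Note on the "Change Directory for Model Downloads" cell: it uses `environ` directly, and the "Libraries" cell is empty in this commit, so the name is presumably imported there in the working copy. A minimal standalone sketch of the same step follows, assuming the import comes from `os` and that the `dist` directory should exist before anything is downloaded. The helper name `set_model_cache_dir` is hypothetical and does not appear in the notebook.

```python
import os
from pathlib import Path


def set_model_cache_dir(path: str = "dist") -> Path:
    """Point llama-index model downloads at `path`, creating it if needed.

    Hypothetical helper for illustration; the notebook sets the variable inline.
    """
    cache_dir = Path(path)
    cache_dir.mkdir(parents=True, exist_ok=True)
    # Set the variable early, before any llama_index import, in case the
    # library reads it at import time (an assumption, not confirmed here).
    os.environ["LLAMA_INDEX_CACHE_DIR"] = str(cache_dir)
    return cache_dir


cache_dir = set_model_cache_dir("dist")
print(cache_dir.resolve())
```

Keeping the value relative (`dist`) matches the notebook, so the result depends on the directory the kernel is started from; `Embeddings/dist/*` is already covered by the `.gitignore` change in this commit.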

+ 115 - 0
Embeddings/Milvus.ipynb

@@ -0,0 +1,115 @@
+{
+ "cells": [
+  {
+   "cell_type": "markdown",
+   "id": "9c5c18a1",
+   "metadata": {},
+   "source": [
+    "* Milvus Vector Database: https://milvus.io"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "id": "2df4fa0b",
+   "metadata": {},
+   "source": [
+    "# Libraries"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "df27797f",
+   "metadata": {
+    "scrolled": true
+   },
+   "outputs": [],
+   "source": []
+  },
+  {
+   "cell_type": "markdown",
+   "id": "4876213b",
+   "metadata": {},
+   "source": [
+    "# Connection and Collection"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "092108a6",
+   "metadata": {},
+   "outputs": [],
+   "source": []
+  },
+  {
+   "cell_type": "markdown",
+   "id": "6a2a2b8f",
+   "metadata": {},
+   "source": [
+    "# Schema Definition"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "baa95bb6",
+   "metadata": {},
+   "outputs": [],
+   "source": []
+  },
+  {
+   "cell_type": "markdown",
+   "id": "2411948b-2a54-46e8-be1c-b3916c32dff8",
+   "metadata": {},
+   "source": [
+    "# Embeddings Count"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "957c6f1c-31b7-40e8-ab18-02069802fe70",
+   "metadata": {},
+   "outputs": [],
+   "source": []
+  },
+  {
+   "cell_type": "markdown",
+   "id": "33f35de9-a15c-48ea-82c9-b4e95d9c8c25",
+   "metadata": {},
+   "source": [
+    "# Embeddings Content"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": null,
+   "id": "c59696a9-a8f7-41c3-86d5-af2a80ea7fea",
+   "metadata": {},
+   "outputs": [],
+   "source": []
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3 (ipykernel)",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.11.6"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 5
+}
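
Note on the "Connection and Collection" step: both notebooks expect a Milvus server to be up already. The sketch below is one way to check that before connecting, using only the standard library. It assumes Milvus listens on its default port 19530 on `localhost`; neither value is stated in this commit, and `milvus_is_reachable` is a hypothetical helper name.

```python
import socket


def milvus_is_reachable(host: str = "localhost", port: int = 19530,
                        timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be opened.

    This only shows that something is listening on the port; it does not
    verify that the listener is actually Milvus.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


if not milvus_is_reachable():
    print("Milvus does not appear to be running; start it with `docker compose up`.")
```

A plain socket probe keeps the check independent of `pymilvus`, so it can run in the first cell before any client library is imported.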

+ 0 - 0
Embeddings/data/.documents


+ 0 - 0
Embeddings/dist/.models


+ 9 - 0
requirements.txt

@@ -1,4 +1,13 @@
 gymnasium[toy-text]==0.29.1
+ipywidgets==8.1.1
 jupyterlab==4.0.9
+llama_cpp_python==0.2.19
+llama-index==0.9.2
 matplotlib==3.7.1
+pymilvus==2.3.3
+pypdf==3.17.1
 numpy==1.23.5
+torch==2.1.1
+torchaudio==2.1.1
+torchvision==0.16.1
+transformers==4.35.2