So lately I've really been bringing self-hosted AI into my work
貼文的作用 2024-12-5 09:27:21 finetune

ArgonArgon 2024-12-5 09:41:46 Want to ask: for self-hosting, is there any model version without the moral guardrails?
It refuses to answer so many things
大王子小王子 2024-12-5 09:46:24 What counts as fake AI?

If I take an LLM and fine-tune it on the stuff I want, what's the difference?
終極ON9平井桃 2024-12-5 10:09:01 Me
Joined GitHub Copilot + use OpenRouter
Write code in VS Code + Copilot
For small projects I start with Cline + the OpenRouter API (Sonnet) to gen the initial code

So far ok
極北鷲 2024-12-5 10:13:08
Go to r/localllama and search for "uncensored" or "abliterated"
大棍巴 2024-12-7 01:15:50 Llama-3.3-70B-Instruct


128K context, multilingual, enhanced tool calling, outperforms Llama 3.1 70B and comparable to Llama 405B 🔥

Comparable performance to 405B with roughly 6x fewer parameters
Improvements (3.3 70B vs 405B):
GPQA Diamond (CoT): 50.5% vs 49.0%
Math (CoT): 77.0% vs 73.8%
Steerability (IFEval): 92.1% vs 88.6%

Improvements (3.3 70B vs 3.1 70B):
Code Generation:
HumanEval: 80.5% → 88.4% (+7.9%)
MBPP EvalPlus: 86.0% → 87.6% (+1.6%)
Steerability:
IFEval: 87.5% → 92.1% (+4.6%)

Reasoning & Math:
GPQA Diamond (CoT): 48.0% → 50.5% (+2.5%)
MATH (CoT): 68.0% → 77.0% (+9%)

Multilingual Capabilities:
MGSM: 86.9% → 91.1% (+4.2%)
MMLU Pro:
MMLU Pro (CoT): 66.4% → 68.9% (+2.5%)

https://www.reddit.com/r/LocalLLaMA/comments/1h85ld5/llama3370binstruct_hugging_face/
Peter_Pan 2024-12-7 13:27:40
How many 4090s does it take to run this?
大棍巴 2024-12-7 13:54:17 2x 4090 = 48 GB VRAM
Llama 3.3 70B Q4_K_M is 42.52 GB
https://huggingface.co/bartowski/Llama-3.3-70B-Instruct-GGUF
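Not from the thread, just a rough sketch of what running that Q4_K_M GGUF across two 4090s could look like with llama-cpp-python; the file path, split ratio and context size are assumptions.

```python
# Hypothetical sketch: loading the Q4_K_M GGUF across two 24 GB 4090s
# with llama-cpp-python. Path and settings are assumptions, not from the thread.
from llama_cpp import Llama

llm = Llama(
    model_path="Llama-3.3-70B-Instruct-Q4_K_M.gguf",  # ~42.52 GB on disk
    n_gpu_layers=-1,          # offload all layers to the GPUs
    tensor_split=[0.5, 0.5],  # spread the weights evenly over the two cards
    n_ctx=8192,               # keep the context modest so the KV cache still fits
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Summarise the Llama 3.3 release notes."}]
)
print(out["choices"][0]["message"]["content"])
```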
極北鷲 2024-12-10 15:40:18 https://ollama.com/blog/structured-outputs

Ollama now supports structured outputs, making it possible to constrain a model's output to a specific format defined by a JSON schema. The Ollama Python and JavaScript libraries have been updated to support structured outputs.

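For reference, the structured-outputs flow in the updated Python library boils down to something like this (the schema and prompt are just an illustration, and the model name assumes llama3.3 is already pulled locally):

```python
# Structured outputs with the Ollama Python library: constrain the reply
# to a JSON schema derived from a Pydantic model.
from ollama import chat
from pydantic import BaseModel

class Country(BaseModel):
    name: str
    capital: str
    languages: list[str]

response = chat(
    model="llama3.3",  # assumes this model is already pulled
    messages=[{"role": "user", "content": "Tell me about Canada."}],
    format=Country.model_json_schema(),  # output must match this schema
)

country = Country.model_validate_json(response.message.content)
print(country)
```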
Junglist 2024-12-10 22:44:21 Just barely enough
全倉NVDA 2024-12-11 15:51:22 Text control is still really weak right now
In the end you just generate the image and overlay the text afterwards anyway

3HongKong 2024-12-12 15:52:14 My goal isn't forum meme pics; I want to make something more meaningful, like Lunar New Year or birthday cards
10蚊跟機 2024-12-16 08:15:30 Daily cost:
AWS EC2 free tier: $0
RPi5 electricity: $0.4
API cost: ChatGPT 4o-mini, I use around $0.5 a day
Google Search JSON API (100 free queries a day): $0
No VPN needed, works on both mobile and PC, daily cost under $1
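Not the poster's actual setup, just a guess at how an under-a-dollar-a-day pipeline like that could be wired: the free Custom Search quota feeds snippets into gpt-4o-mini. The environment variable names and prompts are assumptions.

```python
# Hypothetical cheap search-and-answer bot: Google Custom Search JSON API
# (100 free queries/day) + gpt-4o-mini. Key/CX env vars and prompts are assumptions.
import os
import requests
from openai import OpenAI

def google_search(query: str, num: int = 5) -> list[str]:
    """Return result snippets from the Custom Search JSON API."""
    resp = requests.get(
        "https://www.googleapis.com/customsearch/v1",
        params={
            "key": os.environ["GOOGLE_API_KEY"],
            "cx": os.environ["GOOGLE_CSE_ID"],
            "q": query,
            "num": num,
        },
        timeout=10,
    )
    resp.raise_for_status()
    return [item["snippet"] for item in resp.json().get("items", [])]

def answer(query: str) -> str:
    """Answer a question using only the retrieved snippets."""
    snippets = "\n".join(google_search(query))
    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    completion = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": "Answer using only the provided snippets."},
            {"role": "user", "content": f"Snippets:\n{snippets}\n\nQuestion: {query}"},
        ],
    )
    return completion.choices[0].message.content

print(answer("What is the weather in Hong Kong today?"))
```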
大棍巴 2024-12-30 13:05:39 If you don't want your data sent to China, DeepSeek 3 is on OpenRouter now and you can pick Together.ai as the inference provider.

It also looks like the DeepSeek 3 license this time is royalty-free, so a lot more hosts should be able to serve it soon. Credit to the Chinese guys on this one.
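For anyone who wants to pin the provider like that, OpenRouter accepts provider-routing preferences in the request body; a minimal sketch below, where the model slug and provider name are assumptions to check against OpenRouter's model page.

```python
# Sketch: call DeepSeek V3 via OpenRouter but prefer Together.ai for inference
# and refuse fallbacks to other hosts. Model slug and provider name are assumptions.
import os
import requests

resp = requests.post(
    "https://openrouter.ai/api/v1/chat/completions",
    headers={"Authorization": f"Bearer {os.environ['OPENROUTER_API_KEY']}"},
    json={
        "model": "deepseek/deepseek-chat",
        "messages": [{"role": "user", "content": "Explain tail-call optimisation."}],
        "provider": {
            "order": ["Together"],     # prefer Together.ai
            "allow_fallbacks": False,  # error out rather than route elsewhere
        },
    },
    timeout=60,
)
print(resp.json()["choices"][0]["message"]["content"])
```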
全倉NVDA 2024-12-30 13:36:29 In open-source LLMs the Chinese are flat-out dominating

Nobody really needs to worry about having no AI to use
hevangel 2024-12-30 14:00:09 With all that hassle, you might as well just pay for the big three's APIs.
What's the actual use of hosting an LLM yourself?
hevangel 2024-12-30 14:02:54 We were pretty much among GitHub Copilot's first batch of corporate customers, been using it for over a year. Its VS Code integration really is the strongest.
hevangel 2024-12-30 14:07:32 A CCP product, and closed-source on top of that, would you dare to use it?
Closed-source would be a dead end for them, so they're forced to open-source.
大棍巴 2024-12-30 14:40:07 Honestly the Westerners use it to write code; nobody cares whether it can answer you about 8964.
As long as it's cheap enough and good, people will use it even without open weights.

The DeepSeek API right now states outright that it will use your data for training, and people on Reddit use it all the same
踢波泊大巴 2024-12-30 16:30:11 Because there's a lot of text, and on ARM architecture the VRAM is shared with system RAM, you get more than the 3080's 10 GB.
踢波泊大巴 2024-12-30 16:32:11 Work stuff is worried about data leakage, so a local LLM + RAG drafts the answer first, and only then is it passed to the Claude API to revamp the answer.
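A guess at what that two-stage flow could look like: a local model drafts over the retrieved internal chunks, and only the sanitised draft goes out to Claude. Model names and prompts are assumptions.

```python
# Hypothetical local-draft-then-Claude-revamp flow. Internal documents stay
# with the local model; only the draft text is sent to the Anthropic API.
import anthropic
from ollama import chat

def local_draft(question: str, context_chunks: list[str]) -> str:
    """Draft an answer with a local model over retrieved (RAG) chunks."""
    prompt = "Context:\n" + "\n".join(context_chunks) + f"\n\nQuestion: {question}"
    resp = chat(model="llama3.3", messages=[{"role": "user", "content": prompt}])
    return resp.message.content

def claude_revamp(draft: str) -> str:
    """Ask Claude to polish the draft (no raw internal data included)."""
    client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY
    msg = client.messages.create(
        model="claude-3-5-sonnet-latest",  # assumed model alias
        max_tokens=1024,
        messages=[{"role": "user", "content": f"Rewrite this answer clearly:\n\n{draft}"}],
    )
    return msg.content[0].text

chunks = ["(retrieved internal doc chunks would go here)"]
print(claude_revamp(local_draft("What is our VPN policy?", chunks)))
```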

踢波泊大巴 2024-12-30 16:35:00 The Jetson Nano Super coming out has also changed the ecosystem.
兼職陰陽師 2024-12-30 18:39:46 Am I the only one who doesn't like blogs using AI images for thumbnails?

Something about them just feels off
大棍巴 2024-12-30 18:52:36 It's already gone completely off the rails; the worst are those YouTube AI product rumour videos, pure made-up nonsense, wrong from start to finish
Enterprise 2024-12-30 19:07:13 What I dislike most is AI reading the news
Such a waste of breath