Welcome to join.dev
Your home base for developers — watch live streams and talks, track hackathons and releases, follow learning paths, connect your favorite apps, and find your people, all in one activity feed.
Happening now
STREAM · LIVE
Building a RAG pipeline from scratch
MC
TALK · LIVE
Rust at scale: lessons from prod
DC
HACKATHON · Starts in 12 min
AI Agents Hackathon — kickoff
JD
STREAM · LIVE
Live: shipping a Next.js app end-to-end
AR
Your feed
join.dev member · MODEL · 5h
Kimi K2.6 lands on Hugging Face with MIT-licensed MoE weights
Moonshot AI's Kimi K2.6 model card is live on the Hub, a 1T-parameter MoE model with a DeepSeek V3-style architecture and native text, image, and video input, released under a modified MIT license.
https://huggingface.co/moonshotai/Kimi-K2.6
Models · huggingface · open-weights · moe
join.dev member · POST · 7h
BitsMoE: spectral bit-allocation for MoE LLM quantization
New paper proposes an SVD-based spectral-energy-guided bit allocation scheme for quantizing Mixture-of-Experts LLMs, claiming a 12.3x faster quantization pass and 1.76x decoding speedup over GPTQ at 2-bit precision on Qwen3-30B-A3B.
https://arxiv.org/abs/2606.00079
Papers · quantization · moe · paper
join.dev member · NEWS · 10h
OpenAI reveals Jalapeño, its custom AI inference chip with Broadcom
OpenAI shared plans for Jalapeño, a custom inference chip built with Broadcom, joining Google, Apple, and SpaceX in reducing reliance on Nvidia for AI compute.
https://techcrunch.com/video/why-everyone-from-openai-to-spacex-is-building-their-own-chips-and-turning-up-the-heat-on-nvidia/
AI News · ai-news · chips · hardware
join.dev member · RELEASE · 14h
Gemini Embedding 2 goes GA with native multimodal retrieval
Google's Gemini Embedding 2 is now generally available, mapping text, images, video, audio, and PDFs into one embedding space with Matryoshka Representation Learning for flexible 3072/1536/768-dim output.
https://blog.google/innovation-and-ai/models-and-research/gemini-models/gemini-embedding-2-generally-available/
Retrieval · embeddings · multimodal · google
join.dev member · POST · 15h
LiveCodeBench v6 leaderboard: Qwen3.7 Max tops coding evals
The contamination-free LiveCodeBench v6 leaderboard now covers 53 evaluated models across code generation, self-repair, execution, and test-output prediction, with Qwen3.7 Max leading at a 0.916 score.
https://livecodebench.github.io/leaderboard.html
Benchmarks · benchmarks · code-eval · leaderboard
join.dev member · POST · 16h
GLM-5.2 beats GPT-5.5 on long-horizon coding benchmarks
Z.ai's open-weight GLM-5.2 outperforms GPT-5.5 on several long-horizon coding benchmarks at roughly 1/6th the cost, a good reminder to weigh cost-per-task alongside raw accuracy when designing eval harnesses.
https://venturebeat.com/technology/z-ais-open-weights-glm-5-2-beats-gpt-5-5-on-multiple-long-horizon-coding-benchmarks-for-1-6th-the-cost
Evals · benchmarks · evals · coding
join.dev member · POST · 1d
Anthropic on context engineering for agents
Anthropic's engineering team argues the discipline has moved past prompt wording alone: the real lever is curating the smallest set of high-signal tokens across the whole context window, not just the system prompt.
https://www.anthropic.com/engineering/effective-context-engineering-for-ai-agents
RAG · rag · context-engineering · agents
join.dev member · POST · 1d
TeleRAG: lookahead retrieval cuts RAG inference latency
This paper proposes lookahead retrieval, which prefetches likely-needed documents before the generation step requests them, trimming end-to-end RAG latency without hurting answer quality.
https://arxiv.org/pdf/2502.20969
RAG · rag · latency · retrieval
join.dev member · POST · 1d
Hands-on guide: fine-tuning your first LLM with PyTorch + HF
A practical walkthrough of fine-tuning a small open model end to end in PyTorch and Hugging Face Transformers, from tokenization through training loop to eval.
https://huggingface.co/blog/dvgodoy/fine-tuning-llm-hugging-face
PyTorch · fine-tuning · pytorch
join.dev member · POST · 1d
Databricks guide to picking LoRA rank and target modules
Databricks breaks down how rank, alpha, and target-module choice actually trade off against VRAM and overfitting risk when you're doing LoRA fine-tunes.
https://www.databricks.com/blog/efficient-fine-tuning-lora-guide-llms
Fine-tuning · fine-tuning · lora