Author: kanakorn.h

[บันทึกกันลืม] JupyterHub Authenticated with OIDC

kanakorn.h

December 9, 2024
ต่อจากตอนที่แล้ว [บันทึกกันลืม] JupyterHub ด้วย Docker

คราวนี้ ถ้าต้องการให้ ยืนยันตัวตนด้วย OpenID เช่น PSU Passport เป็นต้น ก็ให้ทำดังนี้

ในไฟล์ jupyterhub_config.py ใส่ configuration ประมาณนี้
```
c = get_config()  #noqa
c.Authenticator.allow_all = True
c.JupyterHub.authenticator_class = "generic-oauth"
c.JupyterHub.spawner_class = 'jupyterhub.spawner.LocalProcessSpawner'
c.Spawner.new_user = True
import pwd
import subprocess
def pre_spawn_hook(spawner):
    username = spawner.user.name
    try:
        pwd.getpwnam(username)
    except KeyError:
        import subprocess
        subprocess.check_call(['useradd', '-ms', '/bin/bash', username])
c.Spawner.pre_spawn_hook = pre_spawn_hook
c.Authenticator.admin_users=set(('admin'))
c.GenericOAuthenticator.client_id = "__client__id__"
c.GenericOAuthenticator.client_secret = "__client_secret__"
c.GenericOAuthenticator.redirect_uri = '/oauth2/callback'
c.GenericOAuthenticator.username_claim = 'preferred_username'
c.GenericOAuthenticator.authorize_url = "https://IDP_URL/application/o/authorize/"
c.GenericOAuthenticator.token_url = "https://IDP_URL/application/o/token/"
c.GenericOAuthenticator.userdata_url = "https://IDP_URL/application/o/userinfo/"
c.GenericOAuthenticator.scope = "openid email profile"
c.GenericOAuthenticator.grant_type = "authorization_code"
```
อธิบายเพิ่มเติมนิดนึง
- c.JupyterHub.authenticator_class = “generic-oauth”
  ใช้ generic-oauth ในการยืนยันตัวตน
- สร้าง pre_spawn_hook มา เพื่อให้สร้าง local user
- c.Spawner.pre_spawn_hook = pre_spawn_hook
  จากนั้นบอก Spawner ให้ใช้ function pre_spawn_hook ข้างต้น
- c.GenericOAuthenticator
  ต่าง ๆ เหล่านั้น เป็น config ให้ระบบไปติดต่อ IdP
- c.GenericOAuthenticator.redirect_uri = ‘/oauth2/callback’
  อันนี้สำคัญ คือพอ Authenticate เสร็จก็ส่งมาที่นี่ เพื่อ เข้าสู่ระบบ
- c.GenericOAuthenticator.username_claim = ‘preferred_username’
  อันนี้ ระบุว่า จะเอา field อะไรเป็น username
- ทำตามนี้แหล่ะ ลองมาแล้ว
จากนั้น
```
docker cp jupyterhub_config.py jupyterhub01:/srv/jupyterhub/jupyterhub_config.py 
docker restart jupyterhub01 
docker logs -f jupyterhub01 
```
ถ้าทุกอยากเรียบร้อยดี ก็จะได้หน้าจอประมาณนี้ (อันนี้ทำเร็ว ๆ ยังไม่ได้ใส่ HTTPS)

เมื่อคลิก Sign in with OAuth2.0 ก็จะวิ่งไปหา IdP ที่ตั้งไว้ (ในที่นี้ใช้ Authentik ของหน่วยงาน ซึ่ง แยกออกจากของมหาวิทยาลัยอีกที — เดี๋ยวเขียนวิธีการทำอีกที)

ไหลไปเรื่อย ๆ ก็จะเจอหน้า Authentication ของมหาวิทยาลัย

แล้วก็กลับมาที่ IdP ของเรา

จากนั้นก็เข้าใช้งาน JupyterHub ได้ตามปรกติ

หวังว่าจะเป็นประโยขน์ครับ
December 9, 2024
[บันทึกกันลืม] JupyterHub ด้วย Docker

kanakorn.h

December 6, 2024
นาน ๆ ทำที บันทึกไว้ก่อน

รันคำสั่งนี้ เพื่อ ติดตั้ง JupyterHub รุ่นที่ทดลอง 5.2.1 ใน Docker
```
docker run -d -p 8000:8000 --name jupyterhub01 quay.io/jupyterhub/jupyterhub jupyterhub
docker exec -it jupyterhub01 bash -c "pip install jupyterlab oauthenticator"
docker exec -it jupyterhub01 bash -c "jupyterhub --generate-config"
docker exec -it jupyterhub01 bash -c "adduser admin"
docker exec -it jupyterhub01 bash -c "adduser jupyteruser01"
```
จะสร้าง user ชื่อ admin และ jupyteruser01 โดย admin จะเป็น Administrator ส่วน jupyteruser01 ให้เป็น user ทั่วไป ตรงนี้จะสร้าง user อยู่ภายใน docker container (ขั้นตอนนี้จะมี prompt ให้ตั้งรหัสผ่านของ admin และ jupyteruser01 ตามลำดับ)

container นี้ จะชื่อว่า ‘jupyterhub01’ เข้าถึงได้ที่ port 8000 (หากต้องการเปลี่ยนก็ทำได้)

จานั้น copy file จากข้างใน docker ออกมา
```
docker cp jupyterhub01:/srv/jupyterhub/jupyterhub_config.py .
```
จากนั้น แก้ไข jupyterhub_config.py อย่างน้อย ต้องมีการตั้งค่าดังนี้
```
c = get_config()  #noqa
c.JupyterHub.authenticator_class = 'jupyterhub.auth.PAMAuthenticator'
c.Authenticator.admin_users = ('admin')
c.Authenticator.allow_all = True
```
บรรทัดล่างสุดนั้น เป็นการอนุญาตให้ทุกคนที่เข้าใช้งานได้ และให้ ‘admin’ มีหน้าที่เป็น Admin

จากนั้น ให้ copy file นี้ เข้าไปใน Docker ด้วยคำสั่ง
```
docker cp jupyterhub_config.py jupyterhub01:/srv/jupyterhub/jupyterhub_config.py
docker restart jupyterhub01
```
ซึ่งคนที่เป็น admin จะมีหน้า Admin ให้ใช้งาน จากการคลิก File > Control Panel > Admin หน้าตาดังภาพ

admin สามารถเข้าไปใน Session ของผู้ใช้คนอื่นได้ เหมาะสำหรับใช้กับงานด้าน การเรียนการสอน

หวังว่าจะเป็นประโยชน์ครับ
December 6, 2024
[บันทึกกันลืม] AnythingLLM with Claude.ai

kanakorn.h

September 5, 2024

AnythingLLM เป็นเครื่องมือทำ RAG (retrieval-augmented generation) โดยอาศัย LLM (Large Language Model) ต่าง ๆ เช่น Llama, Mistral, Gemma ซึ่งสามารถใช้แบบ Local LLM ได้ ผ่าน Ollama หรือ LM Studio และกลุ่มที่เป็น Cloud LLM อย่าง GPT-4o, Gemini และ Claude Sonet เป็นต้น

RAG ต่างจากการใช้ Chatbot คือ เราสามารถให้ LLM ทำความเข้าใจ Context หรือ บริบท ของข้อมูลในองค์กรเราได้ ซึ่งแน่นอนว่า Local LLM ย่อมเป็นส่วนตัวกว่า แต่ ในบางกรณี เราก็ต้องการพลังที่เหนือกว่าของ Cloud LLM

ในที่นี้จะ Claude.ai ซึ่งเราต้องมี API key โดยทำตามขั้นตอนดังนี้

0. เริ่มจาก Getting started – Anthropic แล้วคลิกที่ Console

1. จากนั้นเข้าสู่ระบบ ของผมใช้ Google Account จากนั้น กรอก Organization เพื่อ Create Account

2. คลิกที่ Get API keys แล้วก็ Create API

3. ใส่ API Key name ก็จะได้ API Key มา, เก็บในที่ที่ปลอดภัย แล้วก็เอาไปใส่ใน AnythingLLM ดังภาพ (เลือก LLM Provider ให้ตรงกับที่ต้องการ ในที่นี้เป็น Anthropic) และเลือกตัว Model ที่ต้องการใช้ จากนั้นกด Save Change

4. ลองกลับมาใช้ AnythingLLM ซึ่งตอนนี้จะเชื่อมกับ Anthropic Claude.ai แล้ว แต่เป็นบริการที่ต้องเสียตังค์อ่ะนะ ไปเติมเงินแล้วค่อยมาใช้ 555

หวังว่าจะเป็นประโยชน์ครับ

September 5, 2024

[บันทึกกันลืม] วิธีเพิ่ม Node (Ubuntu 22.04) เข้า Kubernetes cluster (version 1.26.15)

kanakorn.h

August 23, 2024

หากเป็น node เก่า อย่าลืมทำ [บันทึกกันลืม] K8S เอา node เดิม join กลับเข้ามาไม่ได้ เป็นปัญหาเพราะ CNI

Adding a new node running Ubuntu 22.04 to Kubernetes version 1.26.15 cluster. Hope this help.

swapoff -a
sed -i 's/\/swap.img/#\/swap.img/g' /etc/fstab
echo 1 > /proc/sys/net/ipv4/ip_forward
modprobe overlay
modprobe br_netfilter
sysctl --system
apt install -y curl gnupg2 software-properties-common apt-transport-https ca-certificates
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | apt-key add -
add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/ubuntu $(lsb_release -cs) stable"
apt update -y
apt install -y containerd
mkdir /etc/containerd
containerd config default > /etc/containerd/config.toml
sed -i 's/SystemdCgroup = false/SystemdCgroup = true/g' /etc/containerd/config.toml
systemctl restart containerd
systemctl enable containerd
apt -y install curl vim git wget apt-transport-https gpg
mkdir /etc/apt/keyrings
curl -fsSL https://pkgs.k8s.io/core:/stable:/v1.26/deb/Release.key | sudo gpg --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg
echo 'deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://pkgs.k8s.io/core:/stable:/v1.26/deb/ /' | sudo tee /etc/apt/sources.list.d/kubernetes.list
apt update -y
apt install -y  kubelet=1.26.15-1.1 kubeadm=1.26.15-1.1
apt-mark hold kubelet kubeadm kubectl
systemctl enable kubelet

August 23, 2024

[บันทึกกันลืม] Virtualization ด้วย KVM และ Cockpit

kanakorn.h

July 25, 2024
จุดประสงค์: เพื่อให้ใช้งาน physical server ได้เต็มประสิทธิภาพ

พอดีใช้ Kubernetes จนถึง ลิมิต 110 pods / node ทำไงดี CPU/Ram เหลือ เลยคิดจะทำ Virtualization ขึ้นไปอีกชั้น จากนั้นก็เอามา join เข้า cluster อีกเครื่อง ทำให้สร้าง 220 pods / nodes เอาว่า เป็นเพื่อการทดลอง แต่ใครมี server ใช้งานไม่เต็มประสิทธิภาพ จะใช้ vmware ก็เกรงจะต้องเสียตังค์ หรือ ไม่อยากไปใช้ promox ve ซึ่งไม่รู้เมื่อไหร่จะต้องเสียตังค์ ก็ลองดูวิธีนี้ได้

ติดตั้ง KVM บน Ubuntu 22.04
```
sudo apt update && sudo apt install -y  \
                    qemu-kvm libvirt-daemon-system libvirt-clients bridge-utils \
                    nfs-common virtinst libvirt-daemon virt-manager
```
ติดตั้ง Cockpit
```
. /etc/os-release
sudo apt install -y -t ${VERSION_CODENAME}-backports cockpit cockpit-machines
```
สร้าง VM
```
virt-install \
  --name vm01 \
  --memory 2048 \
  --vcpus 2 \
  --cdrom ubuntu-22.04.4-live-server-amd64.iso \
  --disk path=/data/kvm/disk/vm01_disk.qcow2,size=100 \
  --os-variant ubuntu22.04 \
  --network type=direct,source=bond0,model=virtio \
  --graphics vnc
```
ใช้งาน cockpit

http://server-ip-address:9090
July 25, 2024
[บันทึกกันลืม] K8S เอา node เดิม join กลับเข้ามาไม่ได้ เป็นปัญหาเพราะ CNI

kanakorn.h

March 18, 2024
ปัญหา: ในบางกรณี เราต้อง delete node ออกไป แต่บางที join กลับได้ แต่ ไม่สามารถ allocate pod ไปได้
```
sudo kubectl describe pod/thepod -n thenamespace
```
แล้วพบว่า
```
Error syncing pod, skipping: failed to "SetupNetwork" for "thepod" with SetupNetworkError: "Failed to setup network for pod \"...(...)\" using network plugins \"cni\": no IP addresses available in network: podnet; Skipping pod"
```
ให้ทำดังต่อไปนี้ กับ nodeX ที่มีปัญหา (ดูจาก subnet ที่ปรากฏ) ด้วยคำสั่งต่อไปนี้

เริ่มจากไปที่ control plane แล้ว delete nodeX ออกไป
```
sudo kubectl delete node nodeX
sudo kubeadm token create --print-join-command
```
แล้ว copy คำสั่งที่ได้มา หน้าตาประมาณนี้
```
kubeadm join ip.of.control.plane:6443 --token xxxxxxxxxx --discovery-token-ca-cert-hash sha256:yyyyyyyyyyyyyyyyyyyyyy
```
จากนั้นไปที่ nodeX
```
sudo systemctl stop containerd
sudo systemctl stop kubelet
sudo ip link set cni0 down
sudo ip link set flannel.1 down
sudo brctl delbr cni0
rm -rf /run/flannel/subnet.env
rm -rf /etc/kubernetes/kubelet.conf
rm -rf /etc/kubernetes/pki/ca.crt
sudo systemctl start containerd
kubeadm join ip.of.control.plane:6443 --token xxxxxxxxxx --discovery-token-ca-cert-hash sha256:yyyyyyyyyyyyyyyyyyyyyy
```
ก็จะใช้งานได้แล้ว

Note: ถ้าพบปัญหา CNI อีก ให้ลองที่ node นั้น ด้วยคำสั่ง
```
 systemctl restart containerd 
```
คาดหวังว่าจะได้ cni interface เกิดขึ้น จากการดูคำสั่ง
```
ip addr | grep cni
```
March 18, 2024
[บันทึกกันลืม] ปัญหา Kubernetes มี Disk Pressure แล้วทำให้ pods อยู่ในสถานะ Evicted ค้างจำนวนมาก

kanakorn.h

March 18, 2024
ปัญหา: เมื่อใช้คำสั่ง kubectl get pod -A แล้ว พบว่า มี pod แสดงสถานะ Evicted เป็นจำนวนมาก (จริง ๆ แล้วมีสถานะอื่นที่ไม่ใช่ Running จำนวนมาก)

ตรวจสอบ: สันนิษฐานว่า Disk เต็ม ใช้คำสั่ง kubectl describe node | grep -i DiskPressure พบว่า KubeletHasDiskPressure แสดงว่า มีปัญหาอะไรสักอย่างกับ Storage และเมื่อดูด้วย kubectl logs -f pod/thepodname -n thenamespace พบว่า “attempting to reclaim ephemeral-storage”

แนวทางการแก้ปัญหา: แต่ละ node ติดตั้งแบบแบ่ง Partition ให้ OS เป็น / (พื้นที่ 100 GB) และ พื้นที่ใช้งานจริง เป็น /data (2.5 TB) ตรวจสอบด้วยคำสั่ง df -h / พบว่า มีการใช้พื้นที่ เกิน 80%

ปัญหานี้เคยเกิดขึ้นกับตอนใช้ Docker แก้ไขโดยการย้าย /var/lib/docker ไปไว้ที่ /data/docker ซึ่งมีพื้นที่มากกว่า แต่ในระบบ Kubernetes ใช้ containerd และ kubelet

วิธีย้าย containerd ไปไว้ใน /data ตามลำดับ

ที่ nodeX (ควรทำทีละ node)
```
sudo mkdir /data/containerd
sudo mkdir /data/containerd/var
sudo mkdir /data/containerd/run
sudo mkdir /data/kubelet
```
กลับมาที่ control plane, ให้ทำการ drain ด้วยคำสั่ง ต่อไปนี้ แล้วรอจนเสร็จ
```
sudo kubectl drain --delete-emptydir-data --ignore-daemonsets nodeX
```
กลับมาที่ nodeX ทำการย้ายข้อมูลของ containerd และ kubelet มาไว้ที่ใหม่
```
sudo systemctl stop containerd
sudo systemctl stop kubelet
sudo rsync -av /var/lib/containerd/ /data/containerd/var/
sudo rsync -av /run/containerd/ /data/containerd/run/
sudo rsync -av /var/lib/kubelet/ /data/kubelet/
```
แก้ไข /etc/systemd/system/kubelet.service.d/10-kubeadm.conf เพิ่ม –root-dir=/data/kubelet
```
# Note: This dropin only works with kubeadm and kubelet v1.11+
[Service]
Environment="KUBELET_KUBECONFIG_ARGS=--bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf  --root-dir=/data/kubelet"
Environment="KUBELET_CONFIG_ARGS=--config=/var/lib/kubelet/config.yaml"
# This is a file that "kubeadm init" and "kubeadm join" generates at runtime, populating the KUBELET_KUBEADM_ARGS variable dynamically
EnvironmentFile=-/var/lib/kubelet/kubeadm-flags.env
# This is a file that the user can use for overrides of the kubelet args as a last resort. Preferably, the user should use
# the .NodeRegistration.KubeletExtraArgs object in the configuration files instead. KUBELET_EXTRA_ARGS should be sourced from this file.
EnvironmentFile=-/etc/default/kubelet
ExecStart=
ExecStart=/usr/bin/kubelet $KUBELET_KUBECONFIG_ARGS $KUBELET_CONFIG_ARGS $KUBELET_KUBEADM_ARGS $KUBELET_EXTRA_ARGS
```
สร้าง containerd configuration
```
cd /etc/containerd/
sudo containerd config default > /etc/containerd/config.toml
```
แก้ไข containerd configuration ดังนี้
```
root = "/data/containerd/var"
state = "/data/containerd/run"
```
จากนั้น start containerd และ kubelet กลับมา
```
sudo systemctl start containerd
sudo systemctl start kubelet
```
กลับไปที่ control plane แล้วเอา nodeX กลับมาทำงานเหมือนเดิม
```
sudo kubectl uncordon nodeX
```
ตรวจสอบ nodeX ว่า DiskPressure มีสถานะเป็น KubeletHasNoDiskPressure หรือไม่ด้วยคำสั่ง
```
sudo kubectl describe node nodeX
```
ถ้าเรียบร้อยแล้ว กลับไปที่ nodeX เพื่อลบข้อมูลเก่า (เฉพาะ /var/lib/containerd/)
```
sudo rm -rf /var/lib/containerd/*
```
จากนั้นวนทำทีละ node จนครบ
March 18, 2024
[บันทึกกันลืม] วิธีหาไฟล์ที่มีคำที่ต้องการ โดยแสดงเฉพาะ ชื่อไฟล์เท่านั้น

kanakorn.h

February 6, 2024
โจทย์ ต้องการหาไฟล์ที่มีคำว่า text_pattern จากทุกไฟล์ในไดเรคทอรี่ /path/to/ ที่มีนามสกุล .ipynb แต่ต้องไม่มีคำว่า checkpoint ใน path หรือชื่อไฟล์
```
find /path/to -name "*.ipynb" ! -name "*checkpoint*" -type f -print0 | while read -r -d '' i; do
    if grep -q "text_pattern" "$i"; then
         echo "$i"  # Output only the filename if the pattern is found
    fi;
done
```
February 6, 2024
[บันทึกกันลืม] แก้ปัญหา rejoin node rke2 ไม่ได้

kanakorn.h

November 13, 2023
ปัญหา

node หนึ่งใน Rancher ใช้งานได้ตามปรกติ แต่ไป sudo apt update; sudo apt upgrade แล้วเกิดเหตุให้ ต้อง restart node ปัญหาที่เกิดขึ้นคือ pod ที่มาสร้างบน node นี้ไม่สามารถติดต่อกับระบบได้เลย

อาการ

pod จะ CrashLoopBackOff ตลอด หรือถ้าดู event จะเห็น timeout ตลอดครับ

วิธีแก้ไข

ได้ลอง format เครื่องก็แล้ว ทำหลายอย่างแล้วก็ไม่หาย ขอบคุณ คุณธนกร กิจศรีนภดล (เทียน) ได้ไปค้นหาวิธีการแก้ไขมาให้ โดยเหตุมาจาก Kernel ของ Ubuntu 20.04 กับ kernel ของ rke2 รุ่นที่ใช้งานอยู่ มี Bug เรื่อง UDP ตาม Link นี้
- RKE2 Cluster running Calico seemingly losing UDP traffic when transiting through service IP to remotely located pod · Issue #1541 · rancher/rke2 (github.com)
- k3s on rhel 8 network/dns probleme and metrics not work #5013
Root cause คือ: kernel bug affects udp + vxlan when using the offloading feature of the kernel

สรุปคือ ใช้คำสั่งนี้
```
sudo ethtool -K flannel.1 tx-checksum-ip-generic off
```
ผลการแก้ไข

หายสนิท ใช้งานได้ต่อไป
November 13, 2023

Author: kanakorn.h

ติดตั้ง KVM บน Ubuntu 22.04

ติดตั้ง Cockpit

สร้าง VM

ใช้งาน cockpit