# SMS XML Search & Bookmarking Tool

A production-ready web application for searching, managing, and bookmarking large SMS XML log files with full-text search, saved searches, and comprehensive tagging.

**Features:**
- 🚀 Upload and index SMS XML files up to 1GB
- 🔍 Multiple search modes: exact, regex, fuzzy, relevance (FTS5)
- 📊 Real-time indexing progress tracking
- 💾 Saved searches and search history
- 🔖 Bookmarks with custom tags
- 📈 CSV export of results
- ⚡ Background job processing with BullMQ
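- 🔐 Hardened input handling: validated uploads, parameterized SQL, restricted CORS (no built-in authentication; see Security Considerations)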
- 📱 Modern SPA interface with React + Vite

---

## System Requirements

- **OS:** Ubuntu Server 24.04 LTS
- **RAM:** 64GB recommended (16GB minimum)
- **Disk:** SSD recommended for database performance
- **Node.js:** v18+ (v20 recommended)
- **Redis:** 6.0+ (required for background jobs)
- **Apache2:** Already configured with SSL/vhosts
- **Disk Space:** 
  - App: ~500MB
  - Database: Depends on XML file size (1GB SMS file → ~500MB database)
  - Uploads: Same as XML file size
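
As a rough planning aid, the figures above can be combined into a single estimate. This is only a sketch: `estimateDiskMB` is a hypothetical helper, and the 50% database ratio is taken from the 1GB file to ~500MB database example, so real usage may differ.

```javascript
// Rough disk estimate in MB, using the ratios listed above:
// app ~500MB, database ~50% of the XML size, uploads equal to the XML size.
function estimateDiskMB(xmlSizeMB) {
  const app = 500;
  const database = xmlSizeMB * 0.5; // assumed ratio, from the 1GB example
  const uploads = xmlSizeMB;
  return { app, database, uploads, total: app + database + uploads };
}

console.log(estimateDiskMB(1024).total); // 2036
```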

---

## Installation

### Prerequisites

```bash
# Update system packages
sudo apt-get update && sudo apt-get upgrade -y

# Install Node.js (if not already installed)
curl -fsSL https://deb.nodesource.com/setup_20.x | sudo -E bash -
sudo apt-get install -y nodejs

# Install Redis
sudo apt-get install -y redis-server
sudo systemctl enable redis-server
sudo systemctl start redis-server

# Install Git
sudo apt-get install -y git

# Verify installations
node --version  # Should be v18+
npm --version
redis-cli --version
```

### Automated Installation

1. **Extract the project ZIP:**
   ```bash
   cd /tmp
   unzip sms-xml-search.zip
   cd sms-xml-search
   ```

2. **Make install script executable:**
   ```bash
   chmod +x install_sms_xml_tool.sh
   ```

3. **Run installation as root:**
   ```bash
   sudo ./install_sms_xml_tool.sh
   ```
   
   This script will:
   - Create directory structure at `/mnt/media_drive2/site-root/code.jetlifecdn.com/xml-tools/sms/`
   - Install Node.js dependencies
   - Initialize SQLite database with FTS5
   - Create systemd services
   - Configure environment variables
   - Print Apache configuration snippets

4. **Update Apache configuration:**

   Edit both SSL and non-SSL vhost files:
   ```bash
   sudo nano /etc/apache2/sites-available/code.jetlifecdn.com.conf
   sudo nano /etc/apache2/sites-available/code.jetlifecdn.com-le-ssl.conf
   ```

   Add this inside the `<VirtualHost>` block (provided by installer):
   ```apache
   # SMS XML Search Tool
   <Directory /mnt/media_drive2/site-root/code.jetlifecdn.com/xml-tools/sms>
       Options -Indexes +FollowSymLinks
       AllowOverride All
       Require all granted
       
       <IfModule mod_rewrite.c>
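           RewriteEngine On
           RewriteBase /xml-tools/sms/
           # Leave API requests untouched
           RewriteRule ^api/ - [L]
           # RewriteCond only applies to the next RewriteRule, so the file and
           # directory checks must sit directly above the SPA fallback
           RewriteCond %{REQUEST_FILENAME} !-f
           RewriteCond %{REQUEST_FILENAME} !-d
           RewriteRule . /xml-tools/sms/index.html [L]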
       </IfModule>
   </Directory>
   
   ProxyPreserveHost On
   ProxyPass /xml-tools/sms/api/ http://127.0.0.1:3000/ timeout=300
   ProxyPassReverse /xml-tools/sms/api/ http://127.0.0.1:3000/
   
   RewriteEngine On
   RewriteCond %{HTTP:Upgrade} websocket [NC]
   RewriteCond %{HTTP:Connection} upgrade [NC]
   RewriteRule ^/xml-tools/sms/api/(.*)$ "ws://127.0.0.1:3000/$1" [P,L]
   ```

5. **Enable Apache modules:**
   ```bash
   sudo a2enmod proxy
   sudo a2enmod proxy_http
   sudo a2enmod proxy_wstunnel
   sudo a2enmod rewrite
   sudo a2enmod headers
   ```

6. **Test Apache configuration:**
   ```bash
   sudo apache2ctl configtest
   # Should output: Syntax OK
   ```

7. **Reload Apache:**
   ```bash
   sudo systemctl reload apache2
   ```

8. **Start application services:**
   ```bash
   sudo systemctl start sms-xml-backend
   sudo systemctl start sms-xml-worker
   sudo systemctl enable sms-xml-backend
   sudo systemctl enable sms-xml-worker
   ```

9. **Verify services are running:**
   ```bash
   sudo systemctl status sms-xml-backend
   sudo systemctl status sms-xml-worker
   sudo systemctl status redis-server
   ```

---

## Usage

### Web Interface

Open your browser and navigate to:
```
https://code.jetlifecdn.com/xml-tools/sms/
```

### Uploading SMS XML Files

1. Navigate to **Upload** section
2. Click "Select File" and choose your SMS XML file (up to 1GB)
3. Click "Upload" to start
4. Monitor progress bar showing:
   - Upload percentage (chunked uploads)
   - Parsing progress
   - Messages processed / total
5. Once complete, file is indexed and searchable
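
The upload percentage in step 4 comes from sending the file in chunks. A minimal sketch of how a client might plan those chunks is shown here; `planChunks`, `uploadPercent`, and the 5 MiB chunk size are assumptions for illustration, not the tool's actual values.

```javascript
// Hypothetical helper: split a file of `fileSize` bytes into fixed-size chunks.
// The 5 MiB default is an assumption for illustration only.
function planChunks(fileSize, chunkSize = 5 * 1024 * 1024) {
  if (!Number.isInteger(fileSize) || fileSize < 0) {
    throw new RangeError("fileSize must be a non-negative integer");
  }
  if (!Number.isInteger(chunkSize) || chunkSize <= 0) {
    throw new RangeError("chunkSize must be a positive integer");
  }
  const chunks = [];
  for (let start = 0, index = 0; start < fileSize; start += chunkSize, index++) {
    const end = Math.min(start + chunkSize, fileSize); // exclusive end offset
    chunks.push({ index, start, end });
  }
  return chunks;
}

// Upload percentage after `done` of `total` chunks have been acknowledged.
function uploadPercent(done, total) {
  return total === 0 ? 100 : Math.floor((done / total) * 100);
}

const plan = planChunks(12 * 1024 * 1024);  // a 12 MiB file
console.log(plan.length);                   // 3
console.log(uploadPercent(1, plan.length)); // 33
```

In a browser, each `{ start, end }` pair could be passed to `file.slice(start, end)` to obtain the bytes for one chunk request.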

### Searching

1. Go to **Search** tab
2. Select search mode from dropdown:
   - **Exact Match:** Find exact message text
   - **Regex:** Use regular expressions (e.g., `\d{3}-\d{4}`)
   - **Fuzzy:** Approximate matching for typos
   - **Relevance:** FTS5 ranked full-text search (best for natural language)
3. Enter search query
4. (Optional) Apply filters:
   - Date range
   - Contact/address
   - Thread ID
   - Direction (sent/received)
5. Click "Search" to execute
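
To illustrate how the first three modes differ, here is a small in-memory sketch. It is not the tool's implementation: `matches` and `editDistance` are hypothetical names, "exact" is assumed to mean a case-sensitive substring match, and "fuzzy" is assumed to allow one typo per word. Relevance mode is left out because ranking is delegated to SQLite FTS5.

```javascript
// Levenshtein edit distance between two strings (classic dynamic programming).
function editDistance(a, b) {
  let prev = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost);
    }
    prev = curr;
  }
  return prev[b.length];
}

// Hypothetical matcher: returns true when `text` matches `query` under `mode`.
function matches(text, query, mode, maxTypos = 1) {
  switch (mode) {
    case "exact":
      return text.includes(query); // assumed: case-sensitive substring
    case "regex":
      return new RegExp(query).test(text);
    case "fuzzy":
      // A word matches if it is within `maxTypos` edits of the query.
      return text
        .toLowerCase()
        .split(/\W+/)
        .some((word) => editDistance(word, query.toLowerCase()) <= maxTypos);
    default:
      throw new Error(`unsupported mode: ${mode}`);
  }
}

console.log(matches("Call me at 555-1234", "\\d{3}-\\d{4}", "regex")); // true
console.log(matches("See you tomorow", "tomorrow", "fuzzy"));          // true
console.log(matches("See you tomorrow", "Tomorrow", "exact"));         // false
```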

### Search Results

- View message preview with timestamp and contact
- Click "Show Context" to see N messages before/after in same thread
- Click bookmark icon to save message
- Apply tags to organize bookmarks
- Re-run saved searches from dashboard
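
The "Show Context" behavior can be pictured as a window over one thread. The sketch below is an assumption-laden illustration: `getContext` and the field names `id`, `threadId`, and `date` are hypothetical, and the real tool presumably does this with a database query.

```javascript
// Hypothetical helper: return up to `n` messages before and after the message
// with id `messageId`, restricted to the same thread and ordered by timestamp.
function getContext(messages, messageId, n = 3) {
  const target = messages.find((m) => m.id === messageId);
  if (!target) return [];
  const thread = messages
    .filter((m) => m.threadId === target.threadId)
    .sort((a, b) => a.date - b.date);
  const pos = thread.findIndex((m) => m.id === messageId);
  return thread.slice(Math.max(0, pos - n), pos + n + 1);
}

const sample = [
  { id: 1, threadId: "a", date: 100 },
  { id: 2, threadId: "b", date: 150 },
  { id: 3, threadId: "a", date: 200 },
  { id: 4, threadId: "a", date: 300 },
  { id: 5, threadId: "a", date: 400 },
];
console.log(getContext(sample, 3, 1).map((m) => m.id)); // [ 1, 3, 4 ]
```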

### Managing Bookmarks

1. Click bookmark icon on any message
2. Add notes (optional)
3. Assign one or more tags
4. View all bookmarks in **Dashboard** → **Bookmarks**
5. Filter by tag or search by content
6. Click to view message in context
7. Edit or delete bookmark

### Tags

1. Go to **Dashboard** → **Tags**
2. Create new tags with custom colors
3. Assign to bookmarks
4. View count of tagged items
5. Rename or delete tags
6. Tag counts update automatically
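
As an illustration of what a tag count represents, the sketch below derives counts from a list of bookmarks. `countTags` and the `tags` array on each bookmark are hypothetical; the tool itself may compute counts in SQL.

```javascript
// Hypothetical helper: count how many bookmarks carry each tag.
// A tag listed twice on the same bookmark is counted once.
function countTags(bookmarks) {
  const counts = new Map();
  for (const bookmark of bookmarks) {
    for (const tag of new Set(bookmark.tags)) {
      counts.set(tag, (counts.get(tag) || 0) + 1);
    }
  }
  return counts;
}

const taggedBookmarks = [
  { id: 1, tags: ["work", "urgent"] },
  { id: 2, tags: ["work", "work"] },
  { id: 3, tags: [] },
];
const tagCounts = countTags(taggedBookmarks);
console.log(tagCounts.get("work"));   // 2
console.log(tagCounts.get("urgent")); // 1
```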

### Search History & Saved Searches

**History:**
- Automatic: Every search is logged
- View in **Dashboard** → **Search History**
- Re-run any previous search
- History is capped at 1000 entries; once the cap is exceeded, the oldest searches are removed automatically
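
The 1000-entry cap on history can be sketched as follows. `recordSearch` is a hypothetical helper operating on an in-memory array (oldest first); the real tool presumably enforces the cap in the database.

```javascript
const MAX_HISTORY = 1000; // limit stated above

// Hypothetical helper: append a search to the history and drop the oldest
// entries once the limit is exceeded. Returns a new array (oldest first).
function recordSearch(history, entry, limit = MAX_HISTORY) {
  const next = [...history, entry];
  return next.length > limit ? next.slice(next.length - limit) : next;
}

let searchHistory = [];
for (let i = 0; i < 1001; i++) {
  searchHistory = recordSearch(searchHistory, { query: `q${i}` });
}
console.log(searchHistory.length);   // 1000
console.log(searchHistory[0].query); // q1 (q0 was evicted)
```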

**Saved Searches:**
- Click "Save Search" on results to save query with name
- View in **Dashboard** → **Saved Searches**
- Run again with one click
- Edit or delete saved searches

### Export Results

1. Execute a search
2. Click "Export to CSV"
3. Choose format:
   - Current results only
   - Include context messages
   - All messages in date range
4. Download CSV file
5. Open in Excel, Google Sheets, or Python
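
Message bodies often contain commas, quotes, and line breaks, so an export has to quote fields carefully. The sketch below shows RFC 4180 style quoting; `csvField`, `toCsv`, and the column names are assumptions, and the tool's real export columns may differ.

```javascript
// Quote a single CSV field following RFC 4180: wrap in double quotes when the
// value contains a comma, quote, or line break, and double any embedded quotes.
function csvField(value) {
  const s = value === null || value === undefined ? "" : String(value);
  return /[",\r\n]/.test(s) ? `"${s.replace(/"/g, '""')}"` : s;
}

// Hypothetical column set, used only for this illustration.
function toCsv(rows, columns = ["date", "address", "direction", "body"]) {
  const lines = [columns.join(",")];
  for (const row of rows) {
    lines.push(columns.map((c) => csvField(row[c])).join(","));
  }
  return lines.join("\r\n") + "\r\n";
}

const csv = toCsv([
  { date: "2025-01-01T10:00:00Z", address: "1234567890", direction: "in", body: 'He said "hi", then left' },
]);
console.log(csv);
// date,address,direction,body
// 2025-01-01T10:00:00Z,1234567890,in,"He said ""hi"", then left"
```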

---

## Configuration

### Backend Configuration

Edit `/mnt/media_drive2/site-root/code.jetlifecdn.com/xml-tools/sms/backend/.env`:

```bash
# Server settings
NODE_ENV=production
PORT=3000
HOST=127.0.0.1

# Database
DATABASE_PATH=/mnt/media_drive2/site-root/code.jetlifecdn.com/xml-tools/sms/databases/sms.db
UPLOAD_DIR=/mnt/media_drive2/site-root/code.jetlifecdn.com/xml-tools/sms/uploads
# 1GB in bytes (comment kept on its own line; not every .env parser strips inline comments)
MAX_UPLOAD_SIZE=1073741824

# Redis
REDIS_HOST=127.0.0.1
REDIS_PORT=6379

# Background worker
WORKER_CONCURRENCY=2
# 5 minutes in milliseconds
JOB_TIMEOUT=300000

# Logging
LOG_LEVEL=info
LOG_FILE=/mnt/media_drive2/site-root/code.jetlifecdn.com/xml-tools/sms/logs/backend.log

# Security
CORS_ORIGIN=https://code.jetlifecdn.com
SESSION_SECRET=<auto-generated>
```
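
For orientation, here is a sketch of how a Node.js backend might read these settings with defaults and basic validation. `loadConfig` is a hypothetical helper, not the tool's actual loader; the variable names and default values are the ones shown above.

```javascript
// Hypothetical loader: read settings from an env-like object, apply defaults,
// and fail fast on values that are not non-negative integers.
function loadConfig(env) {
  const int = (name, fallback) => {
    const raw = env[name];
    if (raw === undefined || raw === "") return fallback;
    const n = Number(raw);
    if (!Number.isInteger(n) || n < 0) {
      throw new Error(`${name} must be a non-negative integer, got "${raw}"`);
    }
    return n;
  };
  return {
    port: int("PORT", 3000),
    host: env.HOST || "127.0.0.1",
    maxUploadSize: int("MAX_UPLOAD_SIZE", 1073741824),
    redis: { host: env.REDIS_HOST || "127.0.0.1", port: int("REDIS_PORT", 6379) },
    workerConcurrency: int("WORKER_CONCURRENCY", 2),
    jobTimeoutMs: int("JOB_TIMEOUT", 300000),
  };
}

const config = loadConfig({ PORT: "3000", MAX_UPLOAD_SIZE: "1073741824" });
console.log(config.maxUploadSize / 1024 ** 3); // 1
console.log(config.redis.port);                // 6379
```

In a real process the call would be `loadConfig(process.env)` after the `.env` file has been loaded.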

### Frontend Configuration

Edit `/mnt/media_drive2/site-root/code.jetlifecdn.com/xml-tools/sms/frontend/.env`:

```bash
VITE_API_URL=https://code.jetlifecdn.com/xml-tools/sms/api/
VITE_PUBLIC_PATH=/xml-tools/sms/
```

---

## Service Management

### Start Services

```bash
# Start backend API
sudo systemctl start sms-xml-backend

# Start background worker
sudo systemctl start sms-xml-worker

# Start Redis
sudo systemctl start redis-server

# Restart all
sudo systemctl restart sms-xml-backend sms-xml-worker redis-server
```

### Stop Services

```bash
sudo systemctl stop sms-xml-backend sms-xml-worker
```

### Enable Auto-Start on Reboot

```bash
sudo systemctl enable sms-xml-backend sms-xml-worker redis-server
```

### Check Service Status

```bash
sudo systemctl status sms-xml-backend
sudo systemctl status sms-xml-worker
sudo systemctl status redis-server
```

### View Logs

```bash
# Backend logs
tail -f /mnt/media_drive2/site-root/code.jetlifecdn.com/xml-tools/sms/logs/backend.log

# Worker logs
tail -f /mnt/media_drive2/site-root/code.jetlifecdn.com/xml-tools/sms/logs/worker.log

# System logs
journalctl -u sms-xml-backend -f
journalctl -u sms-xml-worker -f

# Redis logs
journalctl -u redis-server -f

# Redis runtime statistics (not a log)
redis-cli INFO stats
```

---

## Database Management

### Re-index All Messages

```bash
cd /mnt/media_drive2/site-root/code.jetlifecdn.com/xml-tools/sms/backend

# Clear existing index and rebuild FTS5
npm run reinit-fts5
```

### Backup Database

```bash
# Create a consistent snapshot (safe while the app is running)
sqlite3 /mnt/media_drive2/site-root/code.jetlifecdn.com/xml-tools/sms/databases/sms.db ".backup '/mnt/media_drive2/site-root/code.jetlifecdn.com/xml-tools/sms/databases/sms.db.backup'"

# Or snapshot to a dated file and compress it
# (.backup is used instead of cp because copying a live WAL database can produce an inconsistent file)
BACKUP=/path/to/backup/sms-$(date +%Y%m%d).db
sqlite3 /mnt/media_drive2/site-root/code.jetlifecdn.com/xml-tools/sms/databases/sms.db ".backup '$BACKUP'"
gzip "$BACKUP"
```

### Restore from Backup

```bash
# Stop services
sudo systemctl stop sms-xml-backend sms-xml-worker

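# Restore (first remove WAL/SHM files left over from the database being replaced)
rm -f /mnt/media_drive2/site-root/code.jetlifecdn.com/xml-tools/sms/databases/sms.db-wal \
      /mnt/media_drive2/site-root/code.jetlifecdn.com/xml-tools/sms/databases/sms.db-shm
cp /path/to/backup/sms.db.backup /mnt/media_drive2/site-root/code.jetlifecdn.com/xml-tools/sms/databases/sms.db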

# Fix permissions
sudo chown www-data:www-data /mnt/media_drive2/site-root/code.jetlifecdn.com/xml-tools/sms/databases/sms.db

# Start services
sudo systemctl start sms-xml-backend sms-xml-worker
```

### Check Database Size

```bash
ls -lh /mnt/media_drive2/site-root/code.jetlifecdn.com/xml-tools/sms/databases/sms.db
du -sh /mnt/media_drive2/site-root/code.jetlifecdn.com/xml-tools/sms/databases/

# Query message count
sqlite3 /mnt/media_drive2/site-root/code.jetlifecdn.com/xml-tools/sms/databases/sms.db \
  "SELECT COUNT(*) as total_messages FROM messages;"
```

---

## Development

### Local Development Setup

```bash
# Clone/extract project
cd sms-xml-search

# Backend development
cd backend
npm install
cp .env.example .env
# Edit .env with local paths
npm run dev

# In another terminal, from the backend/ directory, start the worker
npm run worker

# Frontend development (another terminal)
cd ../frontend
npm install
npm run dev
# Opens http://localhost:5173
```

### Development Environment Variables

**Backend `.env` (dev):**
```bash
NODE_ENV=development
PORT=3000
HOST=localhost
DATABASE_PATH=./databases/sms.db
UPLOAD_DIR=./uploads
LOG_LEVEL=debug
```

**Frontend `.env` (dev):**
```bash
VITE_API_URL=http://localhost:3000/
VITE_PUBLIC_PATH=/
```

### Running Tests

```bash
# Backend tests
cd backend
npm test

# Frontend tests
cd ../frontend
npm test
```

---

## Troubleshooting

### Services Won't Start

**Check system logs:**
```bash
journalctl -xe
journalctl -u sms-xml-backend -n 50
```

**Common causes:**
- Redis not running: `sudo systemctl start redis-server`
- Port 3000 in use: `lsof -i :3000`
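- Database locked: Stop both services, confirm nothing else has the file open (`sudo lsof /mnt/media_drive2/site-root/code.jetlifecdn.com/xml-tools/sms/databases/sms.db`), then restart. Do not delete `sms.db-wal` or `sms.db-shm` by hand; they can hold committed data that has not been checkpointed yet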
- Permission errors: `sudo chown -R www-data:www-data /mnt/media_drive2/site-root/code.jetlifecdn.com/xml-tools/sms/`

### Upload Fails

**Check disk space:**
```bash
df -h /mnt/media_drive2/
```

**Verify permissions:**
```bash
ls -la /mnt/media_drive2/site-root/code.jetlifecdn.com/xml-tools/sms/uploads/
```

**Clear stuck uploads:**
```bash
rm -f /mnt/media_drive2/site-root/code.jetlifecdn.com/xml-tools/sms/uploads/*.partial
```

### Search is Slow

**Check database health:**
```bash
sqlite3 /mnt/media_drive2/site-root/code.jetlifecdn.com/xml-tools/sms/databases/sms.db \
  "PRAGMA integrity_check; PRAGMA optimize;"
# integrity_check should print: ok
```

**Optimize FTS5:**
```bash
sqlite3 /mnt/media_drive2/site-root/code.jetlifecdn.com/xml-tools/sms/databases/sms.db \
  "INSERT INTO messages_fts(messages_fts) VALUES('optimize');"
```

### Redis Connection Issues

**Check Redis status:**
```bash
redis-cli ping
# Should return: PONG

redis-cli INFO server
```

**Restart Redis:**
```bash
sudo systemctl restart redis-server
sudo systemctl status redis-server
```

### Apache Returns 502/503

**Verify backend running:**
```bash
sudo systemctl status sms-xml-backend
curl http://127.0.0.1:3000/api/health
```

**Check Apache error and access logs:**
```bash
tail -50 /var/log/apache2/error.log
tail -50 /var/log/apache2/access.log
```

**Test Apache proxy:**
```bash
curl -v https://code.jetlifecdn.com/xml-tools/sms/api/health
```

### Workers Not Processing Jobs

**Check Redis queue:**
```bash
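# BullMQ's default key prefix is "bull" (adjust if the app configures a custom prefix)
redis-cli --scan --pattern 'bull:*'
# Jobs waiting to be picked up, and jobs currently being processed
redis-cli LLEN bull:xmlindexer:wait
redis-cli LLEN bull:xmlindexer:active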
```

**Restart worker:**
```bash
sudo systemctl restart sms-xml-worker
```

**Monitor worker:**
```bash
tail -f /mnt/media_drive2/site-root/code.jetlifecdn.com/xml-tools/sms/logs/worker.log
```

---

## Performance Tuning

### SQLite Optimization

```bash
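sqlite3 /mnt/media_drive2/site-root/code.jetlifecdn.com/xml-tools/sms/databases/sms.db << 'EOF'
-- Persistent: journal_mode is stored in the database file
PRAGMA journal_mode = WAL;
-- Per-connection: the settings below reset when this CLI session exits, so the
-- backend must also apply them each time it opens the database
PRAGMA synchronous = NORMAL;
PRAGMA cache_size = -64000;      -- negative value = KiB, so about 64MB of page cache
PRAGMA foreign_keys = ON;
PRAGMA temp_store = MEMORY;
PRAGMA mmap_size = 30000000000;  -- ~30GB request, capped by the build's compile-time maximum
-- page_size is omitted: 4096 is already the default and it cannot be changed in WAL mode
INSERT INTO messages_fts(messages_fts) VALUES('optimize');
EOF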
```

### Redis Optimization

Edit `/etc/redis/redis.conf`:
```bash
# Memory management
maxmemory 4gb
# BullMQ requires noeviction; an LRU policy could evict queue keys and silently lose jobs
maxmemory-policy noeviction

# Persistence (optional)
save 900 1
save 300 10
appendonly yes
```

### Node.js Optimization

Edit systemd service file:
```bash
sudo nano /etc/systemd/system/sms-xml-backend.service
```

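Update the `Environment` line, then apply the change with `sudo systemctl daemon-reload` followed by `sudo systemctl restart sms-xml-backend`: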
```bash
Environment="NODE_OPTIONS=--max-old-space-size=4096 --enable-source-maps"
```

---

## Security Considerations

### File Upload Security

- ✅ Files validated as valid XML before processing
- ✅ Uploaded files stored outside web root
- ✅ File size limited to 1GB via `MAX_UPLOAD_SIZE`
- ✅ MIME type validation
- ✅ Filename sanitization

### SQL Injection Prevention

- ✅ All queries use parameterized statements
- ✅ Input validation with Joi schemas
- ✅ No string concatenation in SQL

### XSS Prevention

- ✅ All output HTML-escaped
- ✅ React auto-escapes by default
- ✅ No `dangerouslySetInnerHTML` usage
- ✅ Content Security Policy headers

### CSRF Protection

- ✅ Stateless API (tokens not required for upload)
- ✅ CORS configured to trusted domain only
- ✅ SameSite cookies enforced

### Authentication

- ⚠️ Currently no auth (assumes trusted network)
- 🔧 Easy to add: see `backend/src/middleware/auth.js`
- Recommended for production with external access

### Backup Security

```bash
# Encrypt backups (symmetric encryption; gpg prompts for a passphrase)
gpg -c /path/to/backup/sms.db.backup
# Store the passphrase separately from the backup!

# Or use encrypted storage
sudo apt-get install ecryptfs-utils
```

---

## Monitoring & Maintenance

### Regular Backups

```bash
# Create automated backup script (writing to /usr/local/bin requires root)
sudo tee /usr/local/bin/backup-sms-db.sh > /dev/null << 'EOF'
#!/bin/bash
set -euo pipefail
BACKUP_DIR=/mnt/backups/sms-search
mkdir -p "$BACKUP_DIR"
sqlite3 /mnt/media_drive2/site-root/code.jetlifecdn.com/xml-tools/sms/databases/sms.db \
  ".backup '$BACKUP_DIR/sms-$(date +%Y%m%d-%H%M%S).db'"
# Keep last 30 days
find "$BACKUP_DIR" -name "sms-*.db" -mtime +30 -delete
EOF

sudo chmod +x /usr/local/bin/backup-sms-db.sh

# Add to crontab
sudo crontab -e
# Add: 0 2 * * * /usr/local/bin/backup-sms-db.sh
```

### Health Checks

```bash
#!/bin/bash
# health-check.sh
curl -f http://127.0.0.1:3000/api/health || exit 1
redis-cli ping | grep -q PONG || exit 1
systemctl is-active sms-xml-backend > /dev/null || exit 1
systemctl is-active sms-xml-worker > /dev/null || exit 1
echo "All systems healthy"
```

### Monitoring Metrics

```bash
# Message count
sqlite3 /mnt/media_drive2/site-root/code.jetlifecdn.com/xml-tools/sms/databases/sms.db \
  "SELECT COUNT(*) FROM messages;"

# Database size
du -sh /mnt/media_drive2/site-root/code.jetlifecdn.com/xml-tools/sms/databases/

# Uploads
ls -la /mnt/media_drive2/site-root/code.jetlifecdn.com/xml-tools/sms/uploads/

# Memory usage
ps aux | grep "[n]ode src"  # the [n] keeps grep from matching its own process
```

---

## API Reference

### Health Check
```bash
GET /api/health
```

### Upload
```bash
POST /api/upload/start
POST /api/upload/chunk
POST /api/upload/complete

# Progress via WebSocket
WS /api/upload/progress/{uploadId}
```

### Search
```bash
POST /api/search
  {
    "query": "hello",
    "mode": "exact|regex|fuzzy|relevance",
    "filters": {
      "dateRange": [startMs, endMs],
      "address": "1234567890",
      "threadId": "123",
      "direction": "in|out"
    }
  }
```
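
A client call to this endpoint might be assembled as in the sketch below. `buildSearchRequest` is a hypothetical helper; only the field names and mode values come from the request body shown above, and the response shape is not covered here.

```javascript
const MODES = ["exact", "regex", "fuzzy", "relevance"];

// Hypothetical helper: build fetch() options for POST /api/search.
// Filters that are not set are left out of the payload.
function buildSearchRequest(query, mode, filters = {}) {
  if (!MODES.includes(mode)) throw new Error(`unknown search mode: ${mode}`);
  const cleaned = Object.fromEntries(
    Object.entries(filters).filter(([, v]) => v !== undefined && v !== null)
  );
  return {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ query, mode, filters: cleaned }),
  };
}

const searchOptions = buildSearchRequest("hello", "relevance", {
  dateRange: [Date.UTC(2025, 0, 1), Date.UTC(2025, 1, 1)], // [startMs, endMs]
  direction: "in",
  threadId: undefined, // dropped from the payload
});
console.log(searchOptions.body);

// Usage, where searchUrl is wherever the search endpoint is exposed:
//   const response = await fetch(searchUrl, searchOptions);
```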

### Bookmarks
```bash
GET /api/bookmarks
POST /api/bookmarks
GET /api/bookmarks/{id}
PUT /api/bookmarks/{id}
DELETE /api/bookmarks/{id}
```

### Tags
```bash
GET /api/tags
POST /api/tags
PUT /api/tags/{id}
DELETE /api/tags/{id}
```

### Export
```bash
POST /api/export/csv
  {
    "searchQuery": "...",
    "mode": "exact|regex|fuzzy|relevance",
    "filters": {...}
  }
```

---

## Architecture Overview

### Backend Stack
- **Framework:** Express.js
- **Database:** SQLite3 with FTS5 full-text search
- **Queue:** BullMQ (Redis-backed job queue)
- **Server:** Node.js with systemd integration
- **Logging:** Winston

### Frontend Stack
- **Framework:** React 18 with Vite
- **Routing:** React Router v6
- **State:** React Hooks + Context API
- **Styling:** Tailwind CSS
- **HTTP Client:** Fetch API with custom hooks

### Data Flow
```
Upload File
  ↓
Backend receives chunks
  ↓
Validates XML structure
  ↓
Enqueues indexing job
  ↓
Worker processes via streaming SAX parser
  ↓
Inserts into SQLite + FTS5 index
  ↓
Progress updates via WebSocket
  ↓
Frontend updates progress bar
  ↓
Indexed and searchable
```
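
The worker stage of this flow can be sketched as a loop that consumes messages one at a time, inserts them in batches, and reports progress after each batch. Everything here is an assumption for illustration: `indexMessages`, the batch size, and the `insertBatch` and `onProgress` callbacks are hypothetical stand-ins for the SAX parser output, the SQLite insert, and the WebSocket update.

```javascript
// Hypothetical sketch: consume parsed messages one at a time (as a streaming
// parser would emit them), insert them in batches, and report progress.
function indexMessages(messageStream, total, { batchSize = 500, insertBatch, onProgress }) {
  let batch = [];
  let processed = 0;
  const flush = () => {
    if (batch.length === 0) return;
    insertBatch(batch);
    processed += batch.length;
    batch = [];
    const percent = total ? Math.floor((processed / total) * 100) : 100;
    onProgress({ processed, total, percent });
  };
  for (const message of messageStream) {
    batch.push(message);
    if (batch.length >= batchSize) flush();
  }
  flush(); // insert whatever is left in the final partial batch
  return processed;
}

// Stand-in for a streaming parser: yields five messages lazily.
function* fakeParser() {
  for (let i = 1; i <= 5; i++) yield { id: i, body: `message ${i}` };
}

const batchSizes = [];
const percents = [];
const indexed = indexMessages(fakeParser(), 5, {
  batchSize: 2,
  insertBatch: (rows) => batchSizes.push(rows.length),
  onProgress: (p) => percents.push(p.percent),
});
console.log(indexed, batchSizes, percents); // 5 [ 2, 2, 1 ] [ 40, 80, 100 ]
```

Batching keeps memory use flat regardless of file size, since at most `batchSize` messages are held at once.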

---

## License & Support

This project is provided as-is for your use. 

For issues or questions:
1. Check logs: `tail -f /mnt/media_drive2/site-root/code.jetlifecdn.com/xml-tools/sms/logs/*.log`
2. Review troubleshooting section above
3. Check service status: `systemctl status 'sms-xml-*'` (quoted so the shell passes the pattern to systemctl unchanged)
4. Verify Apache config: `apache2ctl configtest`

---

**Last Updated:** December 24, 2025  
**Version:** 1.0.0  
**Target:** Ubuntu 24.04 LTS with Apache2
