Open Source Web Content Filtering Project

From Kathmann Labs

A web content filtering system sits between the client computers and the internet and restricts access to certain websites based on defined criteria. It can be either inline, where requests are sent directly through the content filtering system, or passive, where the content filtering system listens in "promiscuous" mode for requests to those sites. If a request for a site deemed unacceptable is found, the content filtering system sends a TCP RST packet to the other end and hijacks the web session to redirect it to the block page. Content filtering systems can be used by parents or administrators to block inappropriate sites from minors, or to allow only certain sites deemed acceptable.
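To make the blocking decision concrete, here is a minimal sketch of the kind of check a filter makes for each request. The domain names, the BLOCKED_DOMAINS set, and the function names are all made up for illustration; DansGuardian uses its own list files and far richer criteria than a plain domain match.

```python
from urllib.parse import urlsplit

# Hypothetical blocklist for illustration only; not a DansGuardian list format.
BLOCKED_DOMAINS = {"blocked.example", "ads.example"}


def is_blocked(url, blocked_domains=BLOCKED_DOMAINS):
    """Return True if the URL's host is a blocked domain or a subdomain of one."""
    host = (urlsplit(url).hostname or "").lower()
    labels = host.split(".")
    # Check the full host and then each parent domain:
    # www.blocked.example -> blocked.example -> example
    return any(".".join(labels[i:]) in blocked_domains for i in range(len(labels)))


def decide(url):
    """Map a request URL to the action the filter would take."""
    return "redirect-to-block-page" if is_blocked(url) else "allow"
```

An inline filter would apply a check like this before forwarding the request, while a passive one would apply it to traffic it observes and then reset the session.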
The concept behind this project is to build a web content filtering server with a small footprint that can be installed in a VM or on a small computer (like the Mini-ITX systems shown on this site). Parents can then filter what their children see and keep logs of what they were trying to reach. Another possibility is to add a wireless card to the system and have it act as a DHCP server, NAT device, firewall, wireless AP, and router in addition to filtering web content. This would allow home or SMB users to place a single device on their network that handles most of the functions of a wireless router plus content filtering. The system can act as a transparent proxy (if it's on an inline layer 3 device), in reporting-only mode (logging without blocking), or as a filter that just blocks the sites it sees.
SARG has just been added: a logging utility used in conjunction with Squid cache and DansGuardian. With this utility, you can view what your kids, or the computers behind your Web Content Filter, have been accessing. NOTE: I've found out recently that when a large number of users are being logged with SARG, disk usage can build up very quickly in the /var/www/sarg folder. See the Delete Older Than Script for a script that deletes files older than 7 days.
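For reference, the cleanup idea can be sketched in a few lines of Python. This is not the script linked above, just a rough equivalent under my own assumptions: the function name delete_older_than and its parameters are invented here, and "older than 7 days" is taken to mean the file's modification time.

```python
import os
import time


def delete_older_than(root, max_age_days=7, now=None, dry_run=False):
    """Delete regular files under root whose mtime is older than max_age_days.

    Returns the list of paths removed (or that would be removed if dry_run).
    Directories are left in place, even if they end up empty.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # 86400 seconds per day
    removed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # lstat does not follow symlinks, so a link is judged by its own age.
            if os.lstat(path).st_mtime < cutoff:
                if not dry_run:
                    os.remove(path)
                removed.append(path)
    return removed
```

Calling delete_older_than("/var/www/sarg", dry_run=True) first lists what would be deleted without touching anything; once the list looks right, the same call without dry_run could be run daily from cron.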
NOTE: DansGuardian is free for home or personal use, and for limited use within government agencies. It is NOT, however, free for business use, although you can use it in a business setting for a very small fee. Please see the DansGuardian link below.

Live Demos

DansGuardian - To test the system, set your proxy server to the following address: Server = kathmann.dyndns.org, Port = 8080. For instructions on how to change your proxy settings, please see the instructions below.
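If you would rather test from a script than change your browser settings, something like the following sketch should work. It uses only Python's standard urllib; the helper names are my own, and the demo server may well not be reachable when you read this, so treat the host and port as placeholders for your own filter.

```python
import urllib.request

PROXY_HOST = "kathmann.dyndns.org"  # demo server named above; may be offline
PROXY_PORT = 8080


def proxy_settings(host=PROXY_HOST, port=PROXY_PORT):
    """Build the proxy mapping that urllib expects for plain HTTP traffic."""
    return {"http": f"http://{host}:{port}"}


def build_proxied_opener(host=PROXY_HOST, port=PROXY_PORT):
    """Return an opener that sends HTTP requests through the filter."""
    handler = urllib.request.ProxyHandler(proxy_settings(host, port))
    return urllib.request.build_opener(handler)


def fetch_via_proxy(url, timeout=10):
    """Fetch url through the proxy; returns (status, first 200 bytes of body)."""
    with build_proxied_opener().open(url, timeout=timeout) as response:
        return response.status, response.read(200)
```

Fetching an allowed page should return that page's content, while a blocked one should give you the block page instead (or raise an HTTPError, depending on how the filter responds).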
SARG - Click here to see the SARG utility in action.

Expertise

2 out of 5
Basic Linux administration knowledge
Basic knowledge of TCP/IP & HTTP

Hardware

Mini-Box M300
Dell Poweredge SC440 - (virtualized)

Software

Linux
DansGuardian
Squid Cache
SARG

Web Links

DansGuardian Web Page
Squid Cache Web Page
CentOS Web Page
SARG Web Page
Ubuntu Web Page

How-tos

DansGuardian Red Hat Linux Install how-to
Add logging to the above project with SARG
DansGuardian Ubuntu Linux Install how-to
Add logging to the above project with SARG

These steps stay the same regardless of which how-to you follow

  • Next, change the nameserver on your local machine so that it points to itself.
  • Open the resolver configuration: vi /etc/resolv.conf
  • Change the first nameserver statement to: nameserver 127.0.0.1
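The resolv.conf change above can also be scripted. The sketch below only transforms the file's text, so it can be tried safely on a copy; the function name is my own invention, and appending a nameserver line when none exists is my assumption rather than part of the steps above.

```python
def point_first_nameserver_to_localhost(resolv_conf_text, address="127.0.0.1"):
    """Return resolv.conf text with the first nameserver line set to address."""
    lines = resolv_conf_text.splitlines()
    for i, line in enumerate(lines):
        # Match only real statements; commented-out lines start with "#" or ";".
        if line.split()[:1] == ["nameserver"]:
            lines[i] = f"nameserver {address}"
            break
    else:
        # Assumption: if no nameserver statement exists, add one at the end.
        lines.append(f"nameserver {address}")
    return "\n".join(lines) + "\n"
```

To apply it, read /etc/resolv.conf, pass the text through the function, and write the result back as root. Keep in mind that a DHCP client may overwrite this file the next time it renews its lease.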

VMware Virtual Appliances

CentOS / DansGuardian / Squid / SARG VMware Virtual Appliance - after opening the VM, run "netconfig" at the command line, because the MAC address will have changed. The username is root and the password is kl-cent-contentfilter.
  • Currently configured with 1 CPU, 384MB RAM (can be lower), and an 8GB HD (SCSI LSI Logic; can also be lower)
  • CentOS 4.4 (2.6.9-42.0.10)
  • DansGuardian 2.8.0.6
  • Squid 2.5.STABLE6-3
  • SARG 2.2.1-1
  • Also running ntpd and BIND