Open Source Web Content Filtering Project
From Kathmann Labs
- A web content filtering system sits between the client computers and the internet and restricts access to certain websites based on defined criteria. It can be inline, where requests are sent directly through the content filter, or passive, where the content filter listens in "promiscuous" mode for those sites. If a request for a site deemed unacceptable is found, the content filter sends a TCP RST packet to the other end and hijacks the web session to redirect it to the block page. Content filtering systems can be used by parents or administrators to block inappropriate sites for minors, or to allow only sites deemed acceptable.
- The concept behind this project is to build a web content filtering server with a small footprint that can be installed in a VM or on a small computer (like the Mini-ITX systems shown on this site). Parents can then filter what their children see and keep logs of what they tried to reach. Another possibility is to include a wireless card in the system design and have the system act as a DHCP server, NAT device, firewall, wireless AP, and router in addition to the web content filter. This would allow home or SMB users to place a single device on their network that handles most of the functions of a wireless router plus content filtering. The system can act as a transparent proxy (if it is an inline layer 3 device), in reporting-only mode (logging without blocking), or simply block the sites it sees.
- Just added SARG, a logging utility used in conjunction with Squid cache and DansGuardian. With this utility, you can view what your kids or the computers behind your Web Content Filter have been accessing. NOTE: I've recently found that when a large number of users are being logged with SARG, disk usage can build up very quickly in the /var/www/sarg folder. See the Delete Older Than Script for a script that deletes files older than 7 days.
- NOTE: DansGuardian is free for home or personal use, and for limited use within government agencies. It is NOT, however, free for business use. You can use it in a business setting for a very small fee. Please see the DansGuardian link below.
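The SARG disk-space note above mentions a script that deletes files older than 7 days. That script is not reproduced on this page, so the following is only a minimal sketch of the same idea, not the original script. The function name `delete_older_than` and its parameters are hypothetical; the only details taken from the text are the /var/www/sarg folder and the 7-day cutoff.

```python
import os
import time


def delete_older_than(root, max_age_days=7, now=None, dry_run=False):
    """Delete regular files under `root` last modified more than
    `max_age_days` ago. Returns the sorted list of affected paths.

    Hypothetical helper for illustration; not the script linked above.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # 86400 seconds per day
    removed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            try:
                # Skip symlinks so only real report files are touched.
                if os.path.islink(path) or not os.path.isfile(path):
                    continue
                if os.path.getmtime(path) < cutoff:
                    if not dry_run:
                        os.remove(path)
                    removed.append(path)
            except OSError:
                # File vanished or is not accessible; leave it alone.
                continue
    return sorted(removed)


# Example call (try dry_run=True first to see what would be deleted):
# delete_older_than("/var/www/sarg", max_age_days=7, dry_run=True)
```

Run from a daily cron job, something like this should keep the report folder from growing without bound. Empty directories are left in place in this sketch.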
Live Demos
- DansGuardian - Set your proxy server to the following address to test the system. Server = kathmann.dyndns.org Port = 8080 For instructions on how to change your proxy settings, please see the instructions below.
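If you would rather test the demo proxy from a script than change your browser settings, a sketch along these lines may help. It uses only the Python standard library. The host and port come from the demo entry above; the helper names `proxy_map` and `fetch_via_proxy` are hypothetical, and the demo server may not be reachable at any given time.

```python
import urllib.error
import urllib.request

PROXY_HOST = "kathmann.dyndns.org"  # demo server named in the text
PROXY_PORT = 8080                   # demo port named in the text


def proxy_map(host=PROXY_HOST, port=PROXY_PORT):
    """Build the proxy mapping urllib expects for plain HTTP traffic."""
    return {"http": "http://%s:%d" % (host, port)}


def fetch_via_proxy(target_url, host=PROXY_HOST, port=PROXY_PORT, timeout=10):
    """Request `target_url` through the filter and return
    (status_code, first 2048 bytes of the body)."""
    handler = urllib.request.ProxyHandler(proxy_map(host, port))
    opener = urllib.request.build_opener(handler)
    try:
        with opener.open(target_url, timeout=timeout) as resp:
            return resp.status, resp.read(2048)
    except urllib.error.HTTPError as err:
        # A filter may answer with an error status for a blocked page;
        # return that response instead of raising.
        return err.code, err.read(2048)


# Example call (requires network access to the demo server):
# status, body = fetch_via_proxy("http://example.com/")
```

Comparing the result for a site you expect to be allowed with one you expect to be blocked gives a rough check that the filter is in the path. What a blocked response looks like depends on how the filter is configured.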
Expertise
- 2 out of 5
- Basic linux administration knowledge
- Basic knowledge of TCP/IP & HTTP
Hardware
- Mini-Box M300
- Dell Poweredge SC440 - (virtualized)
Software
Web Links
How-tos
These stay the same regardless
- make changes to the proxy settings on the client machines
- tweak away
- To help speed up the DNS resolutions you can also add Bind to cache DNS requests locally. Follow the instructions in the Bind DNS Internet Caching Server Project to add this.
- You will then need to change the nameservers on your local machine to point to itself.
- vi /etc/resolv.conf
- Change the first nameserver statement to nameserver 127.0.0.1
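The resolv.conf edit above can also be scripted rather than done by hand in vi. The sketch below rewrites only the first nameserver line and leaves everything else untouched. The function name is hypothetical, and appending a nameserver line when none exists is an assumption of this sketch, not something the steps above call for.

```python
def point_first_nameserver_to_localhost(text, local="127.0.0.1"):
    """Return resolv.conf content with the first `nameserver` entry
    replaced by `local`; other lines are kept as they are."""
    lines = text.splitlines()
    for i, line in enumerate(lines):
        parts = line.split()
        if parts and parts[0] == "nameserver":
            lines[i] = "nameserver " + local
            break
    else:
        # Assumption: if no nameserver line exists, add one.
        lines.append("nameserver " + local)
    return "\n".join(lines) + "\n"


# Example use (run as root, and back up the file first):
# with open("/etc/resolv.conf") as f:
#     updated = point_first_nameserver_to_localhost(f.read())
# with open("/etc/resolv.conf", "w") as f:
#     f.write(updated)
```

Keeping the remaining nameserver entries means the machine can still fall back to an upstream resolver if the local Bind cache is not running.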
VMWare Virtual Appliances
- CentOS / DansGuardian / Squid / SARG VMWare Virtual Appliance - after opening the VM, you will need to run "netconfig" at the command line, as the MAC address will have changed. The username is root, password kl-cent-contentfilter.

