
DNews - Performance Issues - Large Systems

Performance related configuration problems/issues

Format disks with a large block size, for example:

Linux: /sbin/mke2fs -b 4096 -v /dev/sdc1 (device name)
Solaris: newfs -b 8192 -f 8192 -o time DEVICE_NAME
NT: format d: /q /a:32K

Here are some config settings that are a good idea on a big system with a reasonable amount of RAM:

	body_chunk 100000
	big_size 100000
	xover_cache 10000
	xover_write_cache 5000
	thread_in true (if your system and DNews support it; use dnews5.4j1 or later)

How to build a system to support 1000 concurrent users

For 1000 concurrent 'typical' users (as described below) we recommend hardware of approximately these specs:
    Ultra Wide SCSI-2 or better disks, two controllers
    5-10 8gig disks
    500MB RAM
    Pentium 500MHz processor or better
    100Mbit network card
Use dmulti with a slave_n setting of 4-6.
Format disks with large blocking factors, e.g. 32K bytes if possible.

For the OS we recommend NT, Linux or Solaris, although any of the Unix versions should be fine as long as you can increase the file handle limit to at least 2000 handles. DNews 5.3 or later is required.
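
On a Unix system you can check the file handle limit before starting the server. The following is a minimal Python sketch, not part of DNews: it uses the standard `resource` module (Unix only) to read the per-process limit on open files and compare it with the 2000 handles mentioned above. The names `REQUIRED_HANDLES` and `limit_sufficient` are made up for this example.

```python
import resource  # standard library, available on Unix only

REQUIRED_HANDLES = 2000  # minimum number of file handles suggested above


def limit_sufficient(soft_limit, required=REQUIRED_HANDLES):
    """Return True if a soft open-file limit allows at least `required` handles."""
    # RLIM_INFINITY means "no limit", which always satisfies the requirement.
    if soft_limit == resource.RLIM_INFINITY:
        return True
    return soft_limit >= required


if __name__ == "__main__":
    # getrlimit returns the (soft, hard) limits for the current process.
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    print("soft limit:", soft, "hard limit:", hard)
    print("enough handles:", limit_sufficient(soft))
```

If the soft limit is too low, it can usually be raised up to the hard limit from the shell that starts the server (for example with `ulimit -n`). Raising the hard limit itself is OS specific.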

For larger systems contact our consulting experts for advice.

How to build a system to support 300+gig of spooled news

1) Place high-volume groups into smaller piles to avoid getting 'huge' news groups.
2) Add memory; we recommend at least 1000MB of RAM. Many people increase spool size and forget to add memory at the same rate. This often leads to a throttled system, because the indexes all grow and need RAM to compensate.
3) Avoid letting your history file grow over 2gig; to do this see point 1 above.
4) Set bucket_size 100000000, and use at least 10 spool areas, not just one huge disk array.
5) Don't create one huge RAID partition holding spool, history and workarea. Create small partitions for history and workarea and keep them separate from the spool area. It's probably a good idea to split the spool area into 4-8 partitions as well.
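
The numeric rules of thumb in the list above can be collected into a small checker. This is only an illustrative Python sketch: the function name `check_spool_plan` and its parameters are hypothetical, and it assumes "2gig" means 2 GiB (2**31 bytes).

```python
MIN_RAM_MB = 1000          # point 2: at least 1000MB of RAM
MIN_SPOOL_AREAS = 10       # point 4: at least 10 spool areas
MAX_HISTORY_BYTES = 2**31  # point 3: assumption, "2gig" read as 2 GiB


def check_spool_plan(ram_mb, spool_areas, history_bytes):
    """Return a list of warnings for a planned large-spool system."""
    warnings = []
    if ram_mb < MIN_RAM_MB:
        warnings.append("add RAM: indexes grow with the spool")
    if spool_areas < MIN_SPOOL_AREAS:
        warnings.append("use more spool areas instead of one huge array")
    if history_bytes >= MAX_HISTORY_BYTES:
        warnings.append("history file is at or over 2gig")
    return warnings


if __name__ == "__main__":
    # Example: 512MB RAM, a single spool area, 3 GB history file.
    for warning in check_spool_plan(512, 1, 3 * 10**9):
        print("WARNING:", warning)
```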

Dmulti, when to use it

If your site is going to have more than 200 concurrent users and/or many users will be reading news 24 hours a day (while the expire is running), then you might want to configure DMULTI. DMULTI works by splitting DNews into several processes to make the best use of your system resources: one process (s0) handles the incoming news feed, and the other processes handle users reading news.

DMULTI only gives real gains when you have more than one disk and some free RAM. See this page for details on configuring dmulti.

Performance system used for measurements

For our test system we used a single-processor Pentium 300 with 190MB RAM. The spool was made up of five 8gig Ultra Wide SCSI U2W disks on an Adaptec 2940U2W controller, with an Intel 100Mbit ethernet card. This is a small but respectable system for taking a full news feed. We estimate that this system could run 1000 concurrent users fairly comfortably, although it would be best to add a little more RAM, e.g. 384MB. The actual output it could handle was 3mb/second of article/xover output. It was CPU bound, so adding processors would increase the throughput above this level.

Typical user, how we arrived at these figures

To measure a typical user we simply averaged the use of 200 real users on a very large, overpowered news server. These users were 'unrestricted' in what they could do, and some may have been pulling partial news feeds from the system. If you limited users to 'reasonable' volumes of news, it is likely you could handle two or three times as many users as our estimates predict.

Hence our typical user was connected for 35 minutes, downloaded 5mb of data, read 130 items, read 1150 xover records and perused 30 news groups. 

Please note this does not mean that each user on a system designed as above to take 1000 users would be limited to the above figures, but rather that the users as a group would do that much on average. Many users may use 10 times these resources while others would use much less.
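
To see how the per-user averages translate into server load, the arithmetic can be sketched in a few lines of Python. This is a rough estimate, not a measurement: it assumes "5mb" means 5 megabytes, that every concurrent user behaves like the average user, and that the load is spread evenly over the 35 minute session. The function name `aggregate_load` is made up for this example.

```python
# Per-user averages quoted in the text above.
SESSION_MINUTES = 35
DATA_MB = 5            # assumption: "5mb" means 5 megabytes
ARTICLES = 130
XOVER_RECORDS = 1150


def aggregate_load(concurrent_users):
    """Return (MB/s, articles/s, xover records/s) for a steady user population."""
    session_seconds = SESSION_MINUTES * 60
    # Each user spreads their session totals evenly over the session length.
    scale = concurrent_users / session_seconds
    return (DATA_MB * scale, ARTICLES * scale, XOVER_RECORDS * scale)


if __name__ == "__main__":
    mb_s, articles_s, xover_s = aggregate_load(1000)
    print(f"{mb_s:.2f} MB/s, {articles_s:.0f} articles/s, "
          f"{xover_s:.0f} xover records/s")
    # prints "2.38 MB/s, 62 articles/s, 548 xover records/s"
```

Under these assumptions 1000 concurrent users need roughly 2.4 MB/second, which is below the 3mb/second measured on the test system and is consistent with the "fairly comfortably" estimate.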

Scaling to multiple systems

Readonly Servers

In this mode you add a second box which reads the same spool directories. This lets you randomly send users between 2-3 systems, because all the systems have identical article numbers and can never be out of sync, as they are reading the same files. The disadvantage is that you are adding load to the same original disk system and network, either of which may be the real bottleneck, so you may not increase real throughput much. This also gives you no backup in the event of a disk failure. See here for more details on readonly configuration.

Multiple systems

In our opinion this is the best option in terms of performance, management and reliability, and we strongly recommend it. However, you can't move users randomly between the systems because item numbers will not match exactly.

Replicating servers

This is just like multiple systems, with the addition that item numbers do match perfectly. We strongly recommend against this option. The problem is that keeping the numbering matching perfectly is very difficult, and if it ever goes out of sync due to a fault on one system or a disk crash, re-syncing the servers is a real pain. Avoid this configuration unless your boss/customer insists on it :-). Click here for details on configuring replication.

Tuning parameters

If your stats show that big_add is using a large amount of time and you have lots of RAM, then try some settings like this:

	body_chunk 200000
	big_size 200000
	xover_write_cache 5000
	index_nbuffer 1000