
View Full Version : United Kingdom Relevance
abbas 03-18-2001, 08:10 PM Hello all.
It seems that most of the stuff on this board is relevant to the States, so could someone help me back here in the United Kingdom?
I would like to do the following:
>>Create a search engine, like http://www.google.com (yep, a spider bot).
>>Provide webspace, i.e. individual users & resellers.
>>Provide specialist solutions.
So, my choices are as follows:
>>Use a leased line and create my own data centre.
>>Use someone else's backbone and co-locate my servers.
>>Use some reseller package.
>>Get some dedicated servers.
Bearing in mind the amount of webspace and bandwidth I require, what would you recommend?
I mean, bandwidth costs will be high if I use someone else's?
Please could someone shed some light (even a matchstick).
astralexis 03-19-2001, 08:39 PM Hi,
Interesting idea, to make a search engine... Just curious, are you going to create an index for the whole internet, like Google? How much traffic do you expect this bot to create? I guess the index itself will also be huge...
Just don't forget to check for patents; as far as I know, Google especially has patents on some internet search engine stuff.
Hey, just get one of those "unlimited" accounts, and you can house the whole thing for a few pounds a month. I've never understood why google and yahoo don't do that! :)
Seriously, that's a pretty varied product line. And a pretty tall order, if you're talking about developing your own search engine technology. In any case, if I were to do it (which I wouldn't) I wouldn't necessarily house all of those projects in the same place. You might be better off to find the best individual deal for each of them; perhaps outsource some of it and run some in your own NOC, or maybe end up placing it all in different places. I guess my point is, don't necessarily think you need just one answer.
GordonH 03-20-2001, 06:23 AM Hello
Most companies doing that would colocate at somewhere like Redbus Interhouse.
Telehouse is actually full at the moment and it's also a lot more expensive.
Of course you could host in the US.
That's what Yahoo do for their UK directory.
Gordon
abbas 03-20-2001, 07:24 AM Hey,
Thanks for all your posts.
Ok, JayC>> I will buy the search engine software, so I do not need to develop my own.
I was thinking of outsourcing the hosting part, but where would I outsource the search engine part to?
Would it be better, though, for me to create my own NOC and run everything from it? It would be a lot easier to manage.
And finally, email a few of those "unlimited" company providers, and you will find out just how "unlimited" it really is.
Thanks.
astra4>>Well, I was hoping the search engine will work on its own and create the index. I expect there to be a lot of traffic in terms of the bot spidering, (don't know about people visiting!). The index will be big. And thanks for the patent info, I will check it out.
Thanks.
GordonH>>Yep, UK is A LOT more expensive than the US. Know of any places that would be suitable in the US?
Thanks a lot you guys, you are giving me some ideas to think about.
GordonH 03-27-2001, 06:05 AM Hello Abbas
Thanks for the e-mail.
I am out of the office so replying here.
<disclaimer>
we do not sell dedicated servers so we have no commercial interest in this.
</disclaimer>
You really need a dedicated server for this project.
What you go for will depend on the software you are going to use.
Assuming it's Linux based then you have a huge choice, but you might want to consider having more than one server in different data centers using load balancing, so that if one data center goes down (sorry VDI, but it still stings) your service will remain active.
This is an expensive project.
We have dedicated servers from 2 companies (soon to be 4).
The best so far for support is Burst.net.
We just ordered a big server from Venturesonline.com and have another coming online with telaserv.com shortly.
Gordon
abbas 03-27-2001, 06:25 AM Thank you GordonH.
I have been talking to some hosts in the US, and they have all said a dedicated server would be the best solution.
In the UK, uk2.net offer a raq for £25 a month, and I am just checking them out. They do not offer support via telephone OR email, so they may not be the right choice.
You say http://www.burst.net/ is the best for support, have you found anyone that is the best for price?
http://www.venturesonline.com/ is very expensive, and I do know that this project will be expensive, but it would be nice to cut it down to a minimum.
Thanks.
Regards.
GordonH 03-27-2001, 06:40 AM The UK2 RAQ is a NO NO.
You have to pay 3 years up front and the servers do not have enough memory.
Venturesonline is not that expensive really.
About the same price as Burst.net. I can't comment on their support because my server is still just a box of components.
I can spread the machines about because I have a few stand alone projects. It may be madness doing it this way but it means I don't have all my eggs in one basket.
If you need a cheap RAQ, 4webspace.com do them at $200 for the RAQ4.
Gordon
abbas 03-27-2001, 09:29 AM Hi,
Thanks.
Well, http://www.4webspace.com/ looks good, but it is a raq. I will require at least linux so that I can install server software (things like mail servers, pgp encryption, etc.).
The search script that I will use is a cgi script using a mySQL database. Does raq support this?
>>>>>I just checked, mySQL is supported by raq, but at a charge of $159US from 4webspace.com. I don't know if it is a one-off charge or a monthly charge. I shall look into it.
In fact, they charge for everything extra, e.g. mySQL, PHP, ASP, and Chilisoft SpicePack.
How much will a raq support in terms of email and web hits?
I am looking for something a little nearer to home (United Kingdom), as my target audience is here, and they will want speeeeeed!
Thanks.
Regards.
[Edited by abbas on 03-27-2001 at 08:37 AM]
Chicken 03-27-2001, 11:01 AM This is just general info about RaQ3s. They do not come with the things you mentioned pre-installed, but you can install them yourself (some things easily, some with more difficulty). ChiliASP carries quite a license fee, and there's a mySQL pkg that is easy to install. The charges listed on their site are one-time installation charges; that much I do know.
RaQ4s have ChiliASP installed already, but anyway...
astralexis 03-27-2001, 11:14 AM I wonder if a distinction is to be made between incoming and outgoing traffic.
I mean, bandwidth cost is normally understood for outgoing internet traffic, which is the normal thing for a web host. I even remember having seen hosts who said they have no limitation as for incoming traffic...
Does one need to discuss a web spider project with the ISP before signing up for a server, so they know what kind of (incoming) bandwidth usage will be made?
Just a quick point in case you don't already realise how expensive (in terms of bandwidth) this is really likely to be. If you do actually realise it, please accept my apologies for thinking you don't.
Before you start looking for some place to host your site, it is probably a good idea to find out exactly what sort of resources you'll require.
If you actually intend to set up a webspider that crawls the entire web (or as much as you can find links to and referers from anyway), it would be a good idea to know just how much bandwidth this will consume. Do you know? I don't, but I'm sure it's a lot. The Internet Archive (http://www.archive.org/internet/about.html#Web_Robots) says that Alexa's web spiders gather over 100 gigabytes a day. That's over 30TB a month. This is not going to be accomplished with a single low-cost server, dedicated or otherwise. In fact, 30TB will consume two full DS3s, so you can expect that amount of bandwidth to cost you at least USD50,000 a month, and that's using a cheapo provider in the States. There's also the matter of serving web pages to the visitors, and the amount of bandwidth that would consume.
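The figures above can be sanity-checked with a few lines of arithmetic. This is only a rough sketch under stated assumptions that are not from the thread: decimal units (1 GB = 10**9 bytes), a 30-day month, and a DS3 running flat out at its nominal 44.736 Mbit/s line rate. The function names are made up for illustration.

```python
# Back-of-envelope check of the crawl bandwidth figures quoted above.
# Assumptions (not from the thread): 1 GB = 10**9 bytes, a 30-day month,
# and a DS3 carrying its nominal 44.736 Mbit/s at 100% utilisation.

SECONDS_PER_DAY = 86_400
DAYS_PER_MONTH = 30
DS3_BITS_PER_SECOND = 44.736e6


def monthly_bytes(bytes_per_day: float) -> float:
    """Total transfer over a 30-day month for a given daily volume."""
    return bytes_per_day * DAYS_PER_MONTH


def sustained_mbit_per_second(bytes_per_day: float) -> float:
    """Average line rate needed to move a daily volume evenly over 24 hours."""
    return bytes_per_day * 8 / SECONDS_PER_DAY / 1e6


def ds3_monthly_bytes() -> float:
    """Most data one fully loaded DS3 can carry in a 30-day month."""
    return DS3_BITS_PER_SECOND / 8 * SECONDS_PER_DAY * DAYS_PER_MONTH


def ds3_lines_needed(bytes_per_month: float) -> float:
    """How many fully loaded DS3s a monthly volume corresponds to."""
    return bytes_per_month / ds3_monthly_bytes()


if __name__ == "__main__":
    daily = 100e9  # the 100 gigabytes a day quoted for Alexa's spiders
    print(f"100 GB/day is {monthly_bytes(daily) / 1e12:.1f} TB per month")
    print(f"sustained rate: {sustained_mbit_per_second(daily):.2f} Mbit/s")
    print(f"one DS3 carries {ds3_monthly_bytes() / 1e12:.2f} TB per month")
    print(f"30 TB/month needs {ds3_lines_needed(30e12):.2f} DS3s")
```

Under these assumptions, 100 GB a day comes to about 3 TB a month (a little over 9 Mbit/s sustained), while 30 TB a month does correspond to roughly two fully loaded DS3s. Real links are rarely run at full utilisation, so the practical requirement would be higher.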
As for the type of servers needed, suffice to say that Google has thousands of computers (6000 of which are using eepro100 network cards, which is where I got my figures from). This will probably fill up Redbus Interhouse on its own.
I suspect that if you really want to be doing this, you may be better off getting your own leased lines, NOC, etc. You will probably also have to develop in-house expertise to manage all of this. Simply buying a web spider script is unlikely to give you the customizations you require (after all, what good would it do you to be a Google that came up later?). You might as well do some kind of co-branding deal with Google.
astralexis 03-27-2001, 12:18 PM You might as well do some kind of co-branding deal with google
Good point. I remember reading on google.com that they are very open to cooperation if someone has a good idea. You can use their index and share your revenue with them...
abbas 03-27-2001, 06:50 PM Thanks 4 your posts guys.
I was not going to generate "that" much bandwidth. I was going to limit the amount of the net the spider indexed.
Thank you though with the idea of branding via google.
The thing is, their site is verrrrry fast, AND they have a lot of sites indexed. If they use lots of computers working in parallel, then the search will be very slow?
Anyway, I was thinking of indexing the net for a few months, i.e. use the entire bandwidth, and then start to advertise the website.
I was looking at Dell, and to hire a server from them (one with roughly 750GB of space) on lease costs less than £1000 or $1500 per month. I could co-locate it on someone else's server?
Building a NOC is very expensive in the UK. Getting a 64K leased line (unlimited bandwidth) costs around £6000 or $9000 per year. A 2MB leased line costs around £20K or $30K a year. See how much more expensive it is here!
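To put the leased-line prices above on a common footing, here is a rough cost-per-gigabyte estimate. It is a sketch under assumptions that are not from the thread: "64K" is read as 64 kbit/s and "2MB" as 2 Mbit/s, each line is saturated all year (365 days), and 1 GB = 10**9 bytes. The helper names are hypothetical.

```python
# Rough cost per gigabyte for the two leased lines quoted above.
# Assumptions (not from the thread): "64K" means 64 kbit/s, "2MB" means
# 2 Mbit/s, the line runs flat out for 365 days, and 1 GB = 10**9 bytes.

SECONDS_PER_YEAR = 365 * 86_400


def yearly_gigabytes(bits_per_second: float) -> float:
    """Most data a line can carry in a year at 100% utilisation, in GB."""
    return bits_per_second / 8 * SECONDS_PER_YEAR / 1e9


def pounds_per_gigabyte(yearly_cost_gbp: float, bits_per_second: float) -> float:
    """Yearly line rental divided by the most data the line could move."""
    return yearly_cost_gbp / yearly_gigabytes(bits_per_second)


if __name__ == "__main__":
    quoted_lines = [("64 kbit/s", 64e3, 6000), ("2 Mbit/s", 2e6, 20000)]
    for label, rate, cost in quoted_lines:
        print(f"{label}: {yearly_gigabytes(rate):,.0f} GB/year, "
              f"about £{pounds_per_gigabyte(cost, rate):.2f} per GB")
```

On these assumptions the 64K line moves at most about 252 GB a year and the 2 Mbit/s line about 7,884 GB a year, so the faster line works out far cheaper per gigabyte even though its rental is higher. Actual throughput would be lower than the theoretical maximum used here.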
Anyway, I will contact google, and let you know what happens (you might want to follow the same project!).
P.S Thanks JayC for the mail. I understand.
Thanks a lot.
Regards.
[Edited by abbas on 03-27-2001 at 05:57 PM]
Originally posted by abbas
I was not going to generate "that" much bandwidth. I was going to limit the amount of the net the spider indexed.
That I don't understand. Of what value would an index be that doesn't even attempt to include the whole of the web, when there are already several that do? Do attempt, I mean; none of them can even come close. If you're setting out right away to store less data than your competitors, how do you compete?