I have been sidetracked by a lot of projects since the last post.
But let me share a project with you that builds on what was covered there.

I have been working on redesigning our current in-store music streaming system.
While it works well, it's not easily scalable.
It's based on a Windows server running our application services & MySQL, plus standard Linux web hosts for streaming the actual music.

The issues here are:
  • MySQL is a security issue ( a single point of failure, unless you go for a large and complicated deployment ), a bottleneck ( it doesn't scale well with writes ), and data loss is likely ( we had a few DB failures & corrupt tables over the years ).
  • Standard web hosting required us to manually upload the files by FTP to each server once they were tagged and indexed.

The Goals Were:
  • High reliability - no single point of failure
  • Automatic Failover in case a server dies
  • No administration
  • Easily scalable

Those who read Part 1 of this series will know that any machine that needs to be always on should not be placed on Azure / Amazon AWS - you simply get too little for your money.
And since the bare-metal servers we already have are so powerful, they have lots of capacity to spare - so the decision was made to run on our existing hardware from Hetzner and Leaseweb ( giving geo-redundancy ), which was really just bored with its current jobs.

The Search For a MySQL Replacement
After trying out pretty much all the free DBs available, I found just ONE(!) that actually works out of the box on Windows, can deliver high availability, doesn't require the servers to be on the same LAN, and doesn't involve a lot of manual settings like names & IP addresses that need to be updated each time a server is decommissioned / instanced.
It's also the only one with a fully working web interface that didn't crumble during my testing abuse :)
The last candidate standing was: Couchbase
( although I did find a memory leak in Couchbase if you abuse it like I did - they are looking into it )

So now we have one cluster of two quad-core, 32GB RAM servers as a testing base, with a second cluster as backup & replication.
So what does that give us?
  • There is very little management - all needed actions & viewing current status can be done at any time from a good web interface.
  • Low Cost - you can use consumer hardware, and they don't need any special connections to work as a cluster.
    In this case I'm just using the RAM & CPU cycles that were free on the servers to begin with.

  • Couchbase automatically re-balances & replicates data between nodes, there is no master, so no single point of failure
  • Easily scalable - as the queries they get are load balanced between servers ( adding more servers -> quicker response times for ALL queries  )
All in all a good match, and we have the database covered ( the issues I had with not being able to rely on SQL are another matter in itself - be ready to rethink how to query / structure the data ).
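
To give a rough idea of what that rethink can look like, here is a minimal sketch in Python. It is not our production code and it does not use the Couchbase SDK: a plain dict stands in for the key-value bucket, and the key prefixes ( track::, artist:: ) and field names are made up for illustration. The point is only that without SQL you look things up by key, so you maintain your own lookup documents instead of relying on a WHERE clause.

```python
import json

# Stand-in for a key-value bucket: key -> JSON text.
# A real deployment would use the database client here; this dict is an assumption.
bucket: dict[str, str] = {}


def save_track(track: dict) -> None:
    """Store the track document and keep a per-artist lookup document up to date."""
    bucket[f"track::{track['id']}"] = json.dumps(track)

    # Hypothetical lookup document: one key per artist holding a list of track ids.
    index_key = f"artist::{track['artist'].lower()}"
    ids = json.loads(bucket.get(index_key, "[]"))
    if track["id"] not in ids:
        ids.append(track["id"])
    bucket[index_key] = json.dumps(ids)


def tracks_by_artist(artist: str) -> list[dict]:
    """Replace 'SELECT ... WHERE artist = ?' with two rounds of key lookups."""
    ids = json.loads(bucket.get(f"artist::{artist.lower()}", "[]"))
    return [json.loads(bucket[f"track::{track_id}"]) for track_id in ids]
```

The trade-off is that every write has to update the lookup documents too, which is exactly the kind of thing SQL used to do for you.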

How to handle streaming and administration of files
To have a no-administration system you need a reliable backend.. we already have Couchbase - how about using that as a file store?
Sadly that won't do - Couchbase limits each stored value to 20MB, so that's a no go.
The second solution would be to use Amazon S3 or Azure Storage, and just let them handle the sharding and balancing.
While that works perfectly even in high-stress situations, the high bandwidth pricing ( it costs more to move a file once than to store it for a full month ) makes it a really bad deal.
So what I did was create a hybrid approach:
The program that reads the music tags from the files automatically splits, obfuscates and uploads them to Azure Storage ( it might just as well be S3 ).
Then I created a lightweight cache web server that really only does two things: when a file is requested it looks in the local cache, and if the file is not there it requests it from Azure and stores it locally. Once that is done, it starts streaming.
Usually with a 4MB mp3 that's over in 1-2 seconds. ( While these servers have a high upload load, the download bandwidth is free to fetch new cache files even during peak hours. )
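
The cache logic described above can be sketched roughly like this in Python. This is an illustration under assumptions, not the actual server: get_cached_file and stream_file are hypothetical names, and fetch_origin is a placeholder for whatever downloads the file from Azure ( or S3 ). The HTTP layer is left out on purpose.

```python
import os
import tempfile
from pathlib import Path
from typing import Callable, Iterator


def get_cached_file(name: str, cache_dir: Path,
                    fetch_origin: Callable[[str], bytes]) -> Path:
    """Return a local path for `name`, downloading it first on a cache miss."""
    # Only accept plain file names so a request cannot escape the cache directory.
    if not name or name != os.path.basename(name) or name in (".", ".."):
        raise ValueError(f"invalid file name: {name!r}")

    cache_dir.mkdir(parents=True, exist_ok=True)
    target = cache_dir / name
    if target.exists():
        return target  # cache hit: nothing to download

    data = fetch_origin(name)  # cache miss: ask the origin (assumed to be Azure)

    # Write to a temp file and rename it, so a half-written file is never served.
    fd, tmp_name = tempfile.mkstemp(dir=cache_dir)
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
        os.replace(tmp_name, target)
    except BaseException:
        if os.path.exists(tmp_name):
            os.remove(tmp_name)
        raise
    return target


def stream_file(path: Path, chunk_size: int = 64 * 1024) -> Iterator[bytes]:
    """Yield the cached file in chunks, ready to be written to the response."""
    with path.open("rb") as fh:
        while chunk := fh.read(chunk_size):
            yield chunk
```

A request handler would call get_cached_file first and then write the chunks from stream_file to the client. The rename step is a design choice in this sketch: two requests for the same uncached file may both download it, but neither can ever see a partial file.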

So what does that give us?
  • There is no management - KISS
  • In case a server dies - all the others can still handle any request - no single point of failure
  • Easily scalable - just rent some new servers and copy-paste a single exe.. it's ready for use.
  • As Azure only functions as a backup / replication service, we can easily set up new servers & new files without having to worry about how many replicas are available - and we don't pay the high bandwidth costs for our normal usage.
    This can easily be extended so servers try to get the files from each other before asking Azure, reducing the cost of server instancing even further.
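
That last extension is not built yet, but a minimal sketch of the idea could look like this in Python. Everything here is hypothetical: fetch_with_fallback is an illustrative name, each entry in peers is assumed to be a function that asks one other cache server for the file, and origin is assumed to be the Azure download.

```python
from typing import Callable, Iterable, Optional

# A peer fetcher returns the file bytes, or None if that peer does not have the file.
PeerFetcher = Callable[[str], Optional[bytes]]


def fetch_with_fallback(name: str,
                        peers: Iterable[PeerFetcher],
                        origin: Callable[[str], bytes]) -> bytes:
    """Try the other cache servers first and only pay for origin bandwidth last."""
    for ask_peer in peers:
        try:
            data = ask_peer(name)
        except OSError:
            continue  # peer is down or unreachable: just try the next one
        if data is not None:
            return data
    # No peer had the file, so fall back to the origin (assumed to be Azure).
    return origin(name)
```

A dead peer is treated the same as a peer without the file, so a single broken server never blocks a request.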



 


Comments

11/15/2014 3:11am

Nice blog! Just one quick note, auto-failover requires three nodes in order to quorum correctly and avoid split brain. You probably are aware of that, but I thought I'd mention it when you said you have two nodes.

Lennart Berg
11/16/2014 3:25pm

Hi Matt
Thanks for taking the time to stop by and leave a comment.
Yeah, I'm aware of the split brain issues, and I have a third node on standby - however, it is an interesting question how much a third node would actually help.
The biggest issue with split brain as I see it would be if the Couchbase cluster differentiated between internal and external communication - and the internal communication failed ( rack backplane or LAN switch failure etc. ).
However, in this case I'm using WAN connections between nodes, meaning that if the communication breaks then it will break for the application and Couchbase at the exact same time.. so theoretically ( it is extremely rare, but possible IRL of course, as ISPs can have internal issues ) we should not end up with split clusters.

But how about the real worst case - all users of Couchbase might be interested in this one: what if the application server loses contact with the biggest quorum?
Will the Couchbase client fail in that case, or would it connect to the smaller cluster ( thus giving us split brain again )?
