Daily Course List 每日法庭排期表 (Daily Court Schedule)

Providing court data for researchers
Selina Cheng - Project initiator

HK Court Lists Archive


A web application that automatically scrapes and archives Hong Kong court lists daily. The front-end application would offer the court list data as a searchable database, free to the public and without restrictions on use.


What is the origin of this project?

Hong Kong has very little open legal data. Currently, http://legalref.judiciary.gov.hk offers only a small number of judgments and court documents, such as High Court judgments. Private, paywalled services such as D-Law exist, but their data is patchy and expensive (>$100 per document order).


All court cases are, however, recorded in the court lists as soon as they enter the justice system, at the "Mention" stage. Currently, the court lists are available for 7 days only: the current day plus the 3 days before and after it. There is no publicly available archive.

As a matter of principle, justice cannot exist without transparency. Open legal data is crucial to a sound justice system.


What social problem are you trying to solve?

Journalists often learn of a case only after that short window has passed, for example from the GIS system, and with limited information on the case. Once the window has passed, it is impossible to search by the individual's or organization's name, case number, date, or nature of charge.


The web app would be useful not only to journalists hoping to pursue a case or research the background of an individual or organization, but also to due diligence professionals, legal professionals, and the general public.


How do we begin from scratch?


What challenges do we face?

  • There may be challenges from the government or organizations over violation of privacy (although the only private information would be the name).
  • There may be government restrictions on the use of legal data
  • Long-term archive maintenance
  • Long-term server space, and possibly server maintenance
  • Funding ($)


What to do:

  • Seek legal advice on privacy issues
  • Build a scraper, possibly with the help of existing open-source tools, that runs at fixed daily intervals
  • Build a database to store the data scraped
  • Build a front-end web application with search entry points: search by parties' names, date, court, nature of charge, etc. (ref: Pacer.gov), then show the full list of available data.
  • Long-term database maintenance
  • Fundraising may be needed to hire coders for longer-term development and to rent server space
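
The database and search steps in this list could be sketched roughly as follows. This is only a minimal sketch using Python's built-in sqlite3 module. The table name `hearings`, its columns, and the function names are assumptions made for illustration; they are not the actual fields of the court lists.

```python
import sqlite3

# Hypothetical schema: the column names are illustrative assumptions,
# not the real layout of the Judiciary's court lists.
SCHEMA = """
CREATE TABLE IF NOT EXISTS hearings (
    hearing_date TEXT NOT NULL,
    court        TEXT NOT NULL,
    case_number  TEXT NOT NULL,
    parties      TEXT NOT NULL,
    nature       TEXT,
    UNIQUE (hearing_date, court, case_number)
)
"""

def open_archive(path=":memory:"):
    """Open (or create) the archive database."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn

def save_hearings(conn, rows):
    """Insert scraped rows; repeating a daily scrape does not duplicate them."""
    conn.executemany(
        "INSERT OR IGNORE INTO hearings VALUES (?, ?, ?, ?, ?)", rows
    )
    conn.commit()

def search(conn, party=None, date=None, court=None):
    """Search by any combination of party name, hearing date, and court."""
    clauses, params = [], []
    if party:
        clauses.append("parties LIKE ?")
        params.append(f"%{party}%")
    if date:
        clauses.append("hearing_date = ?")
        params.append(date)
    if court:
        clauses.append("court = ?")
        params.append(court)
    where = (" WHERE " + " AND ".join(clauses)) if clauses else ""
    sql = (
        "SELECT hearing_date, court, case_number, parties, nature "
        "FROM hearings" + where + " ORDER BY hearing_date"
    )
    return conn.execute(sql, params).fetchall()
```

The UNIQUE constraint together with INSERT OR IGNORE is one way to make the daily scrape safe to re-run, which matters because each court list stays online for several days and would otherwise be stored more than once.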



What resources do you need?

python ==> scrapy

manual => frequency 

error > retry 
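
The retry-on-error idea might look like the sketch below. The function name `fetch_with_retry` and its parameters are hypothetical. The caller supplies `fetch`, so any download function can be plugged in. Scrapy ships its own retry middleware, so a helper like this would only matter for a scraper written without it.

```python
import time

def fetch_with_retry(fetch, url, attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call fetch(url); on failure, wait and retry, doubling the delay each time."""
    for attempt in range(attempts):
        try:
            return fetch(url)
        except Exception:
            if attempt == attempts - 1:
                # Out of retries: re-raise so the failed run is visible.
                raise
            sleep(base_delay * 2 ** attempt)
```

Because each list is only online for a few days, a run that still fails after all retries should be logged and repeated before the page drops out of the window.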

server space estimation, data compression 
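
One way to check how much compression would save is to measure a saved page directly. The sketch below uses Python's standard gzip module. The helper name `compressed_size` is hypothetical, and the actual saving would have to be measured on real scraped pages.

```python
import gzip

def compressed_size(text):
    """Return (raw_bytes, gzip_bytes) for one scraped page stored as UTF-8."""
    raw = text.encode("utf-8")
    return len(raw), len(gzip.compress(raw))
```

Court lists are repetitive tables, so they are likely to compress well, but that is an expectation to verify rather than a measured result.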

SQL database for managing large datasets



crawl from different levels : e.g. 

morph.io

10,000 characters = approx. 10 KB per court per day

10,000 characters x 20 courts x 5 days x 50 weeks

= 50,000,000 characters

= approx. 50 MB per year
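
The same estimate can be written as a small function so the inputs are easy to change. The function name and the `bytes_per_char` parameter are assumptions added for illustration. The original figure assumes 1 byte per character, while Chinese characters take 3 bytes each in UTF-8, so the real size of bilingual lists could be higher.

```python
def yearly_storage_bytes(chars_per_list=10_000, courts=20, days_per_week=5,
                         weeks_per_year=50, bytes_per_char=1):
    """Estimate raw (uncompressed) storage per year for the archived lists."""
    return chars_per_list * courts * days_per_week * weeks_per_year * bytes_per_char

print(yearly_storage_bytes() / 1_000_000, "MB per year")                  # 50.0 MB per year
print(yearly_storage_bytes(bytes_per_char=3) / 1_000_000, "MB per year")  # 150.0 MB per year
```

With `courts=29`, the number of hearing lists mentioned under Progress, the same formula gives 72.5 MB per year at 1 byte per character.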

How will computing speed hold up as data accumulates in the long term?
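
A common way to keep lookups fast as a table grows is to index the columns people search by. The sketch below uses SQLite from Python's standard library and asks SQLite how it would run a query. The table `hearings`, the index `idx_hearings_case`, and the helper `query_plan` are hypothetical names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE hearings (hearing_date TEXT, court TEXT, case_number TEXT, parties TEXT)"
)
# Index the column used for exact lookups.
conn.execute("CREATE INDEX idx_hearings_case ON hearings (case_number)")

def query_plan(conn, sql, params=()):
    """Return SQLite's description of how it would run the query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The last column of each row holds the human-readable plan detail.
    return " ".join(row[-1] for row in rows)

plan = query_plan(conn, "SELECT * FROM hearings WHERE case_number = ?", ("HCA 1/2019",))
print(plan)  # should mention idx_hearings_case rather than a full table scan
```

An ordinary index helps exact matches such as case number or date. A substring search on party names (LIKE '%name%') cannot use it, so a full-text index may be worth evaluating if name searches become slow.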


Progress

  1. Demo using morph.io (by Omar K.)
  • This is a very preliminary prototype. It currently tackles only one of 29 court case hearing lists.
  • morph.io supports scraping the sites once per day and provides CSV and SQLite downloads for further storage
  • TODO: extend scrape.py to cover the hearing lists of all courts, and set up a permanent database hosting server
  2. wget (by Kennon Wong)
  • ...
  3. ...



Who are we? 

Selina Cheng, reporter

selinakycheng@gmail.com 


Relevant links:

https://www.judiciary.hk/tc/crt_lists/daily_caulist.htm

https://www.d-law.com/ 




© 2014-2019 Made by Collaction Team.