WO2004111870A1 - Method and system for barring access to selected internet resources - Google Patents

Method and system for barring access to selected internet resources Download PDF

Info

Publication number
WO2004111870A1
Authority
WO
WIPO (PCT)
Prior art keywords
interface device
access
users
resources
individual
Prior art date
Application number
PCT/AU2004/000791
Other languages
French (fr)
Inventor
James Freeman
Original Assignee
James Freeman
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by James Freeman filed Critical James Freeman
Publication of WO2004111870A1 publication Critical patent/WO2004111870A1/en

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/95Retrieval from the web
    • G06F16/951Indexing; Web crawling techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/95Retrieval from the web
    • G06F16/953Querying, e.g. by the use of web search engines
    • G06F16/9535Search customisation based on user profiles and personalisation

Definitions

  • a method of and system for controlling user access to a public network such as the Internet and more particularly, to provide a consensus-based network access system which is user friendly and able to be overridden by authorised users in a relatively uncomplicated manner.
  • the Internet is a very large collection of computer based resources from around the world which are linked together. Each computer resource accessible via the Internet will have individual properties and content. These resources are identified by a URL ("Uniform Resource Locator") which specifies the precise location of the resource and the network transport protocol required to retrieve it.
  • URL Uniform Resource Locator
  • the Internet comprises a global computer network. Groups of local computers are linked together electronically into a local node network. These local network nodes are in turn linked to other networks and so on ad infinitum.
  • the end result of this interconnection of computer networks is to create a vast array of resources which are accessible and available for public use.
  • the topological interlinking of these networks has led to the use of the term WWW ("World Wide Web") becoming synonymous with the Internet.
  • WWW World Wide Web
  • HTML Hyper Text Markup Language
  • HTTP Hyper Text Transport Protocol
  • HTTPS HyperText Transfer Protocol Secure
  • FTP File Transfer Protocol
  • Gopher, SOCKS
  • NNTP Newsgroups
  • IRC Internet Relay Chat
  • the Internet is expanding on a daily basis and possibly at an exponential rate.
  • the resources available on the Internet are generally easily accessible by anyone with a computer that has Internet access. Readily (and often freely) available software such as
  • Web Browsers, FTP and News Clients allows easy access to the diversity of resources on offer on the Internet.
  • Three main devices or systems have in the past been used in an attempt to ensure that only appropriate material is viewed by selected users. These are the white-list approach, the blacklist approach and the word parse heuristic approach.
  • the first method has involved the generation of a so called “white list” or "yes list” which defines a list of permissible resources. Users are only able to access resources which have been specifically placed on that "white list”.
  • the disadvantage of this system is that the list is invariably out of date almost as soon as it has been compiled and also tends to be very incomplete. Also, generation of the white list is necessarily subjective and cannot be appropriate for all institutions or organisations which might choose to limit the access to the Internet via such a list system.
  • the second system of control which has been used in the past is to generate a so called "blacklist" or "no list".
  • a black list requires the generation of a list of resources that a particular organisation decides should not be accessed by its members.
  • generation of the black list is, by its very nature, subjective, and requires a supervisor or administrator to consider particular sites and make a decision on whether or not that site should be accessed by the members of the organisation. Because there are literally billions of web resources, a thorough review of even a small percentage of the available resources is beyond the means of any single individual or group. As a result the task of generating an inclusive blacklist is impossibly time consuming and the task continues to expand daily.
  • a further flaw of the black list system is that it cannot possibly locate all resources containing inappropriate material, even if the manpower was available to review this vast list. It is widely accepted that even the most sophisticated and largest search engines index perhaps about 50% of all resources on the Internet (although this is of course open to argument). Regardless, an administrator, who will have nothing like the resources of a multi-national search engine enterprise, cannot hope to classify all inappropriate sites. With a blacklist those sites not black listed can, by default, be accessed.
  • a black list cannot be appropriate for all members of an organisation. Once a particular site has been placed on the black list (for whatever reason) it is an administratively complex task to have that site removed from the list, particularly where some might feel the placement of the site on the black list is appropriate, whereas others feel it should be removed. Who is right?
  • the third system for filtering out inappropriate resources is the so called word parse heuristic system.
  • When a resource is requested by a user, its content is first analysed using automated analysis software. The resource is only allowed to be accessed if it meets certain criteria. As a gross simplification, if the analysis tool seeks to prevent access to pornographic material it may identify the key word "sex" as being significant.
  • Resources that contain this word may be blocked.
  • the fundamental flaw in this approach is that natural language recognition by computers, although improving slowly, is still relatively poor.
  • word parse based analysis tools often block resources which contain totally innocuous content whilst allowing access to far more offensive resources.
  • Classic examples include the problems of the webmasters in "Middlesex" who often get blocked, or valid content such as resources dealing with Breast Cancer or Sex Education being banned. Additionally resources containing say pornographic images but no offensive key words often slip through such filters.
  • the invention provides a system in which an automated gateway is located between the individual end users and resources available on a computer network such as the Internet.
  • This gateway provides an access protocol, and this access protocol automatically varies over time, depending on the response of users who attempt to access specific resources that are potentially available via that gateway.
  • the invention provides an adaptive interface device for regulating the access by individual computer users to network resources containing both desirable and undesirable content, the adaptive interface device providing a gateway through which said individual computers will be connected to the network, said adaptive interface device comprising: storage means adapted to store data relating to individual resources accessible via said network, said stored data including at least a global accessibility index relating to said individual resources, said accessibility index adapted to have at least either a positive or negative value; access control means adapted to connect said individual computers to said network when said global accessibility index is positive or deny access when said global accessibility index is negative; and influence means accessible by at least selected authorised users of said individual computers for influencing said accessibility index either positively or negatively, depending on whether those users support or do not support access to said individual network resources.
  • the stored data to include subject matter categorisation which in use will be linked with individual resources accessible via said network, the device enabling individual users to assign a subject matter category to individual resources accessed via said interface device.
  • the access control means may be adapted to connect individual users to only those resources which fall into one or more subject matter categories, or may be adapted to deny individual users from accessing resources which fall into one or more subject matter categories.
  • the device may be adapted to keep a running total of the different subject matter categories assigned by different users to the same resource, and for the subject matter category which is associated with a particular resource to be that which, at any point in time, has received the highest number of assignations by individual users reviewing that resource.
  • Each resource may have a plurality of subject matter categories assigned to it if users determine that the resource covers more than one category. Further there is provided for the device to include resource analysis means adapted to analyse the categorisation of individual resources and to reject that categorisation where the categorisation by one or more users does not pass a predetermined threshold test.
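The running tally and threshold test described in the preceding points can be sketched as follows. The function name, the 50% agreement threshold, and the data layout are illustrative assumptions rather than details from the patent:

```python
from collections import Counter

def categorise(votes, min_share=0.5):
    """Return the majority subject matter category for a resource, or
    None where the leading category fails the agreement threshold.
    `votes` is the list of category labels assigned by individual users."""
    if not votes:
        return None
    tally = Counter(votes)
    category, count = tally.most_common(1)[0]
    # Threshold test: reject the categorisation unless enough of the
    # reviewing users agree on the winning label.
    if count / len(votes) < min_share:
        return None
    return category
```

For example, two of three users assigning "Sport" passes the threshold and yields "Sport", while three users choosing three different labels yields no accepted categorisation.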
  • the resource analysis means may include a probability analysis algorithm to determine whether a resource has been correctly classified.
  • the resource analysis means may be adapted to positively or negatively list users who respectively correctly or incorrectly categorise resources. In the event that a user incorrectly categorises resources on one or more occasions the resource analysis means may be adapted to ignore further categorisations from that user, or notify an administrator of the system of that user's deliberate mis-categorisation. In addition, the system may negate previous categorisations by that user.
  • the probability analysis algorithm may be a Bayesian probability analysis algorithm.
  • the adaptive interface device to include override means which selected authorised users may access to override said access control means to thereby allow access or deny access of selected resources to individual users or groups of users over whom they are charged with responsibility.
  • the storage means may be adapted to keep a running total of the number of authorised users who choose to influence the accessibility index either positively or negatively, and to provide a positive accessibility index when certain criteria are met.
  • access may be allowed if the number of authorised users who influence the index positively exceeds the numbers of authorised users who influence the index negatively. Conversely access may be denied if the number of authorised users who influence the index negatively exceeds the numbers of authorised users who influence the index positively.
  • the invention extends to a system which includes an adaptive interface device for regulating access as above defined, and a plurality of individual computers, each of said individual computers being linked to said adaptive interface device via a local interface device, said local interface device being adapted to allow or deny access to any individual resource, irrespective of the global accessibility index for that resource.
  • Said local interface device may be located in the software of said individual computers or within the software of said adaptive interface device, and be operable using a password or similar control device.
  • Each individual computer may include a password controlled override facility which is adapted to access the override means in order to allow or deny access to individual resources, irrespective of the state of the global accessibility index for that resource.
  • the invention also provides for the system to include independent override means, which is operable independently of said individual computers to deny access to particular resources, or categories of resources, by at least some of said individual computers irrespective of the accessibility index for that resource, said independent override means not able to be overridden by said local interface device.
  • Figure 1 shows diagrammatically the manner in which an Internet access control system according to the invention may be used in an education environment.
  • the system described below is used in a teaching environment which may for example include all schools in a particular town or region. Of course, the system could be used in a number of other applications where some form of network access control is required. Access control might be required, for example, in a home environment, an office environment, or indeed any organisation or location where it is believed that free access to all resources on the Internet is not desirable.
  • the system described below in essence, has two levels of control interposed between individual users and the Internet, as well as an automated gateway which those charged with controlling access by individual users can influence.
  • the first level of control would typically be at a teacher level, the teacher being the person in direct contact with those pupils operating individual computers who wish to locate information on the Internet.
  • the second level of control would normally be provided by a senior administrator, such as a headmaster or senior staff member.
  • the system comprises the following main components.
  • the system comprises an adaptive interface device 10 which typically would be in the form of a proxy server, and might provide access to the Internet for a number of schools in a district or local area.
  • the adaptive interface device 10 would include monitoring software 12, a database 13, accessibility index software 14, and override software 16.
  • the adaptive interface device 10 will provide the link between individual schools 18 and the Internet 20. In essence, a school would only be able to access individual resources on the Internet via the adaptive interface device 10.
  • the accessibility index software 14 provides, in effect, a layered access control system as shown.
  • each school 18 would comprise a series of classrooms 22, each classroom being monitored by a teacher 24, and each teacher having a number of pupils in his or her class as indicated at numeral 26.
  • the pupils 26 each have an individual computer connected via connection lines 28 to the adaptive interface device 10.
  • the teacher 24 may also have a computer connected to said adaptive interface device 10 through a teacher-controlled access line 30.
  • where a resource is inaccessible, it will be possible to immediately make the desired resource available by providing appropriate authorisation (following physical review and categorisation of said resource) to the adaptive interface device, using means such as a password XX known by said teacher 24.
  • said password XX may be transient and allow a single authorisation cycle.
  • said password may allow multiple authorisations for a specified period without the need to re- enter said password.
  • Each school 18 would also have a senior administrator 32 linked to the adaptive interface device 10 via a link 34, the senior administrator having a role of either providing access to particular sites, irrespective of any teacher override, or denying access to any particular site, irrespective of any teacher override.
  • the administrator also has the ability to deny/allow access to entire categories of resource.
  • the teacher and administrator have password controlled layered access via software 16, as shown, to modify the access settings.
  • the adaptive interface device 10 would typically comprise a remote proxy server. This ensures that no local physical hardware management is required. It would be a simple thirty second change for a school to point their Internet access via router 40 at this proxy server (and for that matter a thirty second change to remove the access control if the system proves unsatisfactory). Once the school is connected in this way all Internet access for that school passes through the proxy server. The filtration is driven out of a very large database which forms part of the proxy server. This database holds what could be called a "grey list" (grey because it is neither a true white list nor a black list).
  • the grey list is so called because the individual users who access the Internet via the adaptive interface device 10 will give most of the sites accessed either a positive or negative ranking. At its most basic if a particular site has more positive rankings than negative rankings it will be accessible. If the site has more negative rankings it will not be accessible, unless a global allow/deny facility in the interface device 10 is overridden.
  • the grey list may contain a table that looks like:
  • the table will have millions of resources classified in this way.
  • the grey list is seeded as follows. All the black list sites (taken from the widely available black lists) are inserted with votes for: 0; votes against: 1. Likewise all the white list sites (similarly sourced, plus web-robot analysis of appropriate sites and portals) are manually reviewed and, if deemed appropriate, inserted with votes for: 1; votes against: 0.
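The grey list table itself does not survive in this extract. A minimal sketch of such a record store and the seeding step just described, with field names that are assumptions rather than the patent's own, might look like:

```python
# Grey list: resource URL -> vote record. The field names and the use
# of an in-memory dict stand in for the patent's "very large database".
grey_list = {}

def seed_grey_list(black_list_urls, white_list_urls):
    """Seed the grey list as described above: black-list entries start
    at 0 votes for / 1 against, reviewed white-list entries at 1 / 0."""
    for url in black_list_urls:
        grey_list[url] = {"votes_for": 0, "votes_against": 1}
    for url in white_list_urls:
        grey_list[url] = {"votes_for": 1, "votes_against": 0}
```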
  • a given school entity directs all its Internet access through the remote proxy server. Each and every request to access a resource is analysed by the proxy server. First the school's custom list is consulted, and access is allowed or denied if there is a record indicating what the school specifically wants for a given resource. If there is no school record for the site then the global grey list is accessed. If a site is on the list and the votes for are greater than the votes against, access is allowed. If the votes against are greater than the votes for, access is denied. If the site is not on any list, access is also denied by default. If access is denied via the custom list and the local no-override option is set then that site is not accessible.
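The lookup order just described — school custom list first, then the global grey list, then deny by default — can be sketched as follows. The data layouts and names are assumptions, and ties on the grey list are treated here as a denial, which the text leaves unspecified:

```python
def check_access(url, school_list, grey_list):
    """Decide whether a requested resource is accessible.

    `school_list` maps URLs to the school's own allow/deny records;
    `grey_list` maps URLs to global vote counts."""
    # 1. A school-specific record always wins.
    if url in school_list:
        return school_list[url]["allow"]
    # 2. Otherwise the global grey list decides by majority of votes.
    record = grey_list.get(url)
    if record is not None:
        return record["votes_for"] > record["votes_against"]
    # 3. Resources on no list at all are denied by default.
    return False
```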
  • when access is denied the user is presented with a page carrying the access denied message and the reason (local deny, category deny, global deny, unreviewed, etc.), and is offered the opportunity to review the site (and potentially make it accessible) there and then.
  • if the site appears on automated analysis, using a very strict word parse heuristic, to contain totally innocuous material, the system can allow students to review it. This should work much better than a usual word parse heuristic because the system can be set to be overly strict about what is allowed. If the content does not pass the 'very innocuous' test then a username and password will be required to proceed (all teachers will have these). This is the validation stage. Note that the system will have the option to remember the username and password for the duration of a session.
  • sites will generally have an accessibility index development pattern which has the following features. If a site is allowed, and the consensus is that it is acceptable, there is no need to vote so the site will have a low vote count and tend to remain globally available, probably with a score like 2 for 1 against as there is no impetus to review allowed sites. Offensive sites are likely to accumulate votes against them as people click links, get denied access, review it, and then deny it (yet again). The more controversial sites will accumulate high vote counts, with narrow separation between the for and against votes.
  • the voting information, and the category information can be very valuable, and can be used as follows.
  • the vote counts represent the global opinion of the peer user group. It allows a school to set policies based on peer opinion. A strict school might prefer a strict policy, a more liberal one could set it in a more liberal fashion.
  • the vote counts and separation can be used as the determinants of how this is implemented.
  • a well accepted site might have a vote count like 2 for 1 against.
  • a globally agreed offensive site will have an ever increasing against vote count like 2 for 98 against as it is denied, reviewed, and voted against yet again.
  • a moderately controversial site might have a vote count 11 for/9 against, and a highly controversial one 101 for 101 against.
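The vote patterns above suggest a simple controversy measure: a site is only controversial if many users have voted both ways, so the minority-side count is a serviceable proxy. The function and its thresholds below are invented for illustration; the patent only describes the pattern qualitatively:

```python
def controversy(votes_for, votes_against):
    """Illustrative controversy rating based on the minority-side
    vote count. The numeric thresholds are assumptions chosen to
    reproduce the qualitative examples in the text."""
    score = min(votes_for, votes_against)
    if score >= 100:
        return "highly controversial"
    if score >= 5:
        return "moderately controversial"
    return "settled"
```

On the worked examples, both the well accepted site (2 for / 1 against) and the globally rejected one (2 for / 98 against) rate as settled, while 11/9 and 101/101 rate as moderately and highly controversial respectively.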
  • the category information that is input with the votes is also valuable as it will allow entities to ban sites such as 'Abortion Advocacy' or 'Sex Education' on say religious or moral grounds.
  • the blacklist will no doubt include some good sites (it is currently about 5 million records long and gathered from many sources of varying quality). A large proportion will be correctly black listed, probably greater than 99%. For those that are not, the correction will be simple and immediately available to the first reviewer that disagrees.
  • the seeding with blacklist sites is non-essential as all sites are denied by default. It simply helps to add a little momentum to the system. Additionally blacklist sites will not, for instance, ever be offered up for student review if this option has been enabled. No matter how innocuous a site may appear on a word parse, only password authenticated users will be able to modify the permissions of already listed sites.
  • Each individual user entity (one school for argument's sake) is free to change what is accessible instantly with relatively trivial effort (username/password once only if desired, then click three buttons). This makes the system as minimally intrusive as possible. If a teacher feels that access to a particular site is required for his or her pupils then the site can be made accessible immediately. The system is peer opinion driven. That is, access does not depend on any one person's view of what students should see. It is the collective opinion of the very people to whom the education of children is already entrusted - the teachers. A teacher can make a site available to their students as they see fit. That opinion is recorded. If the balance of opinion favours that site being available then it is made globally available. If the balance of opinion does not favour it then it is blocked.
  • Age brackets (primary/secondary/tertiary) are catered for with dedicated servers. Additionally a given entity is free to allow or deny anything they see fit with minimal effort or inconvenience.
  • the content of the site is downloaded and analysed by analysis software.
  • An algorithm that uses Bayesian probability analysis may be used to check for signature features that allow a fairly good automated guess about what that site is about. While such a system is not likely to analyse each and every resource with perfect accuracy, such an analysis technique should be highly accurate at detecting pornographic and gambling related sites (to name a few) as these have very characteristic signature elements.
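A minimal naive Bayes text categoriser of the kind alluded to here is sketched below. The training snippets, category names, and smoothing choices are all invented for illustration; a production system would train on a large labelled corpus of site content:

```python
import math
from collections import Counter

# Toy training corpus: category -> example documents. Entirely invented.
TRAINING = {
    "gambling": ["casino poker bet odds jackpot", "bet odds casino win"],
    "education": ["school lesson homework teacher", "teacher lesson exam"],
}

def train(corpus):
    """Count word frequencies per category."""
    counts = {cat: Counter() for cat in corpus}
    for cat, docs in corpus.items():
        for doc in docs:
            counts[cat].update(doc.split())
    return counts

def classify(text, counts):
    """Pick the category with the highest log-likelihood, using
    Laplace (add-one) smoothing and a uniform prior."""
    vocab = {w for c in counts.values() for w in c}
    best, best_score = None, -math.inf
    for cat, freq in counts.items():
        total = sum(freq.values())
        score = 0.0
        for word in text.split():
            score += math.log((freq[word] + 1) / (total + len(vocab)))
        if score > best_score:
            best, best_score = cat, score
    return best

counts = train(TRAINING)
```

Because gambling sites share very characteristic vocabulary, even this toy model separates the two toy categories cleanly, which is the "signature features" intuition the passage relies on.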
  • Mathematically a system of this nature runs at a sensitivity and specificity level of greater than 99% (that is, with very low false negative and false positive rates). If an algorithm flags a site in a particular category, that categorisation is thus likely to be correct to an accuracy of around 99% for some types of sites, particularly those with fairly typical characteristics, such as pornography and gambling sites.
  • if the algorithm flags the site in, say, the Sex::Pornography category, the system responds as follows: (1) The site is denied for bad apple's entity and the 'no override' bit is set so bad apple cannot change this (unless he/she is one of the very limited group that has administrator privileges for that site). The site is now no longer available even to users at bad apple's entity. It has never been available globally as it has never passed into the global database. In the worst case it was available to bad apple's entity for 1 day.
  • Bad apple gets an email from the system noting the transgression and is invited to either modify the category or lodge a 'your algorithm got it wrong' message. There is also a warning that their administrator has been notified.
  • the administrator for bad apple's entity may or may not actually be notified, depending on their settings and whether this is the first, second, or third transgression by bad apple.
  • the transgression count is important. If the probability of the algorithm recording a false positive is, say, 1%, then a single transgression has an approximate 1:100 statistical chance of being an error. However with two transgressions the odds of an error are about 1:10,000, and with three transgressions it becomes approximately a million to one. Note that if bad apple had flagged xxx.com as porn, none of the above would have occurred. In fact they would have gained trust points. The number of bad apples will be small to start with. After they try to misclassify a site and see that the system is watching them it will shrink further. If they continue to, say, allow porn sites in inappropriate categories then they will have peer pressure applied (in the form of their entity's administrator) and their input will simply be ignored outside of their local entity.
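The odds quoted for repeated transgressions follow from multiplying independent per-flag error probabilities. A quick check, assuming a flat 1-in-100 error rate per flag:

```python
def false_flag_odds(transgressions, error_rate=0.01):
    """Probability that every one of `transgressions` independent
    algorithm flags is an error, at a fixed per-flag error rate."""
    return error_rate ** transgressions

# One flag is wrong about 1 time in 100, two flags in a row about
# 1 in 10,000, and three about 1 in 1,000,000 - matching the
# approximate odds quoted in the text.
```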

Abstract

An adaptive interface device (10) for regulating the access by computer users (26) to network (20) resources which contain desirable and undesirable content. The device (10) provides a gateway through which computers will be connected to the network (20). The device (10) comprises storage means that stores data relating to individual resources, the stored data including a global accessibility index having either a positive or negative value. The device (10) further includes access control means adapted to connect the computers to the network (20) when the index is positive or deny access when the index is negative. The device (10) includes influencing means accessible by selected authorised users (24) for influencing the index either positively or negatively, depending on whether those users (24) support or do not support access to the resources. The device (10) may be used to provide an interactive, user controlled system for permitting or denying users' (26) access to web sites, depending on peer values.

Description

Method and system for barring access to selected Internet Resources
Field of the invention
A method of and system for controlling user access to a public network such as the Internet, and more particularly, to provide a consensus-based network access system which is user friendly and able to be overridden by authorised users in a relatively uncomplicated manner.
Background of the invention The Internet is a very large collection of computer based resources from around the world which are linked together. Each computer resource accessible via the Internet will have individual properties and content. These resources are identified by a URL ("Uniform Resource Locator") which specifies the precise location of the resource and the network transport protocol required to retrieve it.
At its heart the Internet comprises a global computer network. Groups of local computers are linked together electronically into a local node network. These local network nodes are in turn linked to other networks and so on ad infinitum. The end result of this interconnection of computer networks is to create a vast array of resources which are accessible and available for public use. The topological interlinking of these networks has led to the use of the term WWW ("World Wide Web") becoming synonymous with the Internet. Although purists might argue that the WWW technically comprises documents written in a markup language such as HTML ("Hyper Text Markup Language") and available for public use via HTTP ("Hyper Text Transport Protocol"), the terms WWW and Internet have become synonymous in the minds of many. In addition to the well known HTTP, protocols such as HTTPS ("HTTP Secure"), FTP ("File Transfer Protocol"), Gopher, SOCKS, and NNTP ("Newsgroups") are also commonly used to fetch resources. New protocols include IRC ("Internet Relay Chat") and related instant messaging services as well as a variety of Napster-derived file-sharing services, to name just a few. There are many formats in which resources may be retrieved. Some of the more common data formats include TXT, HTML and XML (text); PDF and SWF (documents and multimedia); GIF, JPEG and PNG (graphics); MPEG and AVI (movies); MP3 (music); and ZIP, TAR and TAR.GZ (archives). The Internet is expanding on a daily basis and possibly at an exponential rate. The resources available on the Internet are generally easily accessible by anyone with a computer that has Internet access. Readily (and often freely) available software such as
Web Browsers, FTP and News Clients allows easy access to the diversity of resources on offer on the Internet.
In many situations it is desirable to limit the amount and type of information that certain individuals are permitted to view or retrieve. For example, within an education environment, it is usually undesirable or inappropriate for students to view pornographic or violent content whilst using the Internet. On the other hand, attempting to restrict access to large sections of the Internet will invariably result in appropriate or useful materials not being accessible. As the situation currently stands there is an inability to precisely target that material which is allowed to be viewed, and that material which should be restricted. Because the Internet is an ever expanding resource, and because it is possible for virtually anyone to place content upon it, attempts to filter the material which is accessible to a defined user group tend to become inaccurate and un-useful relatively quickly.
Three main devices or systems have in the past been used in an attempt to ensure that only appropriate material is viewed by selected users. These are the white-list approach, the blacklist approach and the word parse heuristic approach. The first method has involved the generation of a so called "white list" or "yes list" which defines a list of permissible resources. Users are only able to access resources which have been specifically placed on that "white list". The disadvantage of this system is that the list is invariably out of date almost as soon as it has been compiled and also tends to be very incomplete. Also, generation of the white list is necessarily subjective and cannot be appropriate for all institutions or organisations which might choose to limit the access to the Internet via such a list system.
For example, a particular resource which might be inappropriate for primary school children to view could well be an important resource for high school students. In addition, some communities might find a particular site quite appropriate for their scholars to have access to, whereas another community might consider the site to be offensive for, say, religious or racial reasons. Accordingly, white list type control systems tend to be time consuming to update and in many instances defeat the advantages of having access to the diversity of the Internet.
The second system of control which has been used in the past is to generate a so called "blacklist" or "no list". A black list requires the generation of a list of resources that a particular organisation decides should not be accessed by its members. As for white lists, generation of the black list is, by its very nature, subjective, and requires a supervisor or administrator to consider particular sites and make a decision on whether or not that site should be accessed by the members of the organisation. Because there are literally billions of web resources, a thorough review of even a small percentage of the available resources is beyond the means of any single individual or group. As a result the task of generating an inclusive blacklist is impossibly time consuming and the task continues to expand daily.
A further flaw of the black list system is that it cannot possibly locate all resources containing inappropriate material, even if the manpower was available to review this vast list. It is widely accepted that even the most sophisticated and largest search engines index perhaps about 50% of all resources on the Internet (although this is of course open to argument). Regardless, an administrator, who will have nothing like the resources of a multi-national search engine enterprise, cannot hope to classify all inappropriate sites. With a blacklist those sites not black listed can, by default, be accessed.
Yet another disadvantage of having a black list is that such a list cannot be appropriate for all members of an organisation. Once a particular site has been placed on the black list (for whatever reason) it is an administratively complex task to have that site removed from the list, particularly where some might feel the placement of the site on the black list is appropriate, whereas others feel it should be removed. Who is right?
The third system for filtering out inappropriate resources is the so called word parse heuristic system. When a resource is requested by a user, its content is first analysed using automated analysis software. The resource is only allowed to be accessed if it meets certain criteria. As a gross simplification if the analysis tool seeks to prevent access to pornographic material it may identify the key word "sex" as being significant.
Resources that contain this word (generally in combination with other key words) may be blocked. The fundamental flaw in this approach is that natural language recognition by computers, although improving slowly, is still relatively poor. Word parse based analysis tools often block resources which contain totally innocuous content whilst allowing access to far more offensive resources. Classic examples include the problems of the webmasters in "Middlesex", who often get blocked, or valid content such as resources dealing with breast cancer or sex education being banned. Additionally, resources containing, say, pornographic images but no offensive key words often slip through such filters.
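The flaw described above is easy to demonstrate. The following is an illustrative sketch only, not part of the specification; the key word list and sample pages are invented:

```python
# A naive word-parse filter: block any page whose text contains a key word.
# This is the approach whose weaknesses are described above.
BLOCKED_WORDS = ["sex"]  # hypothetical key word list

def naive_word_parse_allow(page_text: str) -> bool:
    """Return True if the page passes the key word filter."""
    text = page_text.lower()
    return not any(word in text for word in BLOCKED_WORDS)

# An innocuous page about the county of Middlesex is wrongly blocked...
print(naive_word_parse_allow("A tourist guide to Middlesex, England"))  # False
# ...while a page of offensive images with no flagged words slips through.
print(naive_word_parse_allow("<img src='offensive1.jpg'>"))             # True
```

Substring matching cannot distinguish "Middlesex" from "sex", and it sees nothing at all in image-only content, which is exactly the twin failure mode described above.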
Of course there are various different types of filtration software or filtration techniques which utilise some combination of the above three systems. However, even when the filtration algorithm is reasonably sophisticated, the prior art filtration techniques do not accommodate the situation where one particular user or administrator considers a particular site should be barred, whereas another user or user group considers that the particular site should be accessed. Thus for example, in a particular school, primary school teachers might consider some resources to be totally inappropriate for their children to be viewing, whereas in a classroom of high school children in another section of the same school, the teacher might consider that same site to be required viewing. Prior art filter systems typically do not allow overriding of blocked sites without the inconvenience and time wasting associated with contacting the person in the organisation with the blocking authority.
Such problems frustrate those wishing to use the Internet for appropriate purposes while also providing some degree of protection from offensive content. It is not uncommon for institutions to introduce filtration and then remove it due to some of the issues noted above.

Summary of the invention
In broad concept the invention provides a system in which an automated gateway is located between the individual end users and resources available on a computer network such as the Internet. This gateway provides an access protocol, and this access protocol automatically varies over time, depending on the response of users who attempt to access specific resources that are potentially available via that gateway.

More particularly, the invention provides an adaptive interface device for regulating the access by individual computer users to network resources containing both desirable and undesirable content, the adaptive interface device providing a gateway through which said individual computers will be connected to the network, said adaptive interface device comprising:

storage means adapted to store data relating to individual resources accessible via said network, said stored data including at least a global accessibility index relating to said individual resources, said accessibility index adapted to have at least either a positive or negative value;

access control means adapted to connect said individual computers to said network when said global accessibility index is positive or deny access when said global accessibility index is negative; and

influence means accessible by at least selected authorised users of said individual computers for influencing said accessibility index either positively or negatively, depending on whether those users support or do not support access to said individual network resources.
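As a minimal sketch of the core rule (the field names are my assumptions for illustration, not language from the claims), the accessibility index and access decision might be represented as:

```python
from dataclasses import dataclass

@dataclass
class GreyListEntry:
    """One network resource as held by the storage means."""
    votes_for: int = 0
    votes_against: int = 0

    @property
    def accessibility_index(self) -> int:
        # Positive when supporters outnumber opponents.
        return self.votes_for - self.votes_against

def access_allowed(entry: GreyListEntry) -> bool:
    """The access control rule: connect only on a positive index."""
    return entry.accessibility_index > 0

print(access_allowed(GreyListEntry(3, 2)))  # True
print(access_allowed(GreyListEntry(1, 1)))  # False (ties are denied)
```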
Further there is provided for the stored data to include subject matter categorisation which in use will be linked with individual resources accessible via said network, the device enabling individual users to assign a subject matter category to individual resources accessed via said interface device. The access control means may be adapted to connect individual users to only those resources which fall into one or more subject matter categories, or may be adapted to deny individual users from accessing resources which fall into one or more subject matter categories. The device may be adapted to keep a running total of the different subject matter categories assigned by different users to the same resource, and for the subject matter category which is associated with a particular resource to be that which, at any point in time, has received the highest number of assignations by individual users reviewing that resource. Each resource may have a plurality of subject matter categories assigned to it if users determine that the resource covers more than one category. Further there is provided for the device to include resource analysis means adapted to analyse the categorisation of individual resources and to reject that categorisation where the categorisation by one or more users does not pass a predetermined threshold test.
The resource analysis means may include a probability analysis algorithm to determine whether a resource has been correctly classified. The resource analysis means may be adapted to positively or negatively list users who respectively correctly or incorrectly categorise resources. In the event that a user incorrectly categorises resources on one or more occasions the resource analysis means may be adapted to ignore further categorisations from that user, or notify an administrator of the system of that user's deliberate mis-categorisation. In addition, the system may negate previous categorisations by that user.
The probability analysis algorithm may be a Bayesian probability analysis algorithm.
Further there is provided for the adaptive interface device to include override means which selected authorised users may access to override said access control means to thereby allow access or deny access of selected resources to individual users or groups of users over whom they are charged with responsibility.
The storage means may be adapted to keep a running total of the number of authorised users who choose to influence the accessibility index either positively or negatively, and to provide a positive accessibility index when certain criteria are met. In a preferred form of the invention, access may be allowed if the number of authorised users who influence the index positively exceeds the number of authorised users who influence the index negatively. Conversely access may be denied if the number of authorised users who influence the index negatively exceeds the number of authorised users who influence the index positively. The invention extends to a system which includes an adaptive interface device for regulating access as above defined, and a plurality of individual computers, each of said individual computers being linked to said adaptive interface device via a local interface device, said local interface device being adapted to allow or deny access to any individual resource, irrespective of the global accessibility index for that resource. Said local interface device may be located in the software of said individual computers or within the software of said adaptive interface device, and be operable using a password or similar control device.
Each individual computer may include a password controlled override facility which is adapted to access the override means in order to allow or deny access to individual resources, irrespective of the state of the global accessibility index for that resource.
The invention also provides for the system to include independent override means, which is operable independently of said individual computers to deny access to particular resources, or categories of resources, by at least some of said individual computers irrespective of the accessibility index for that resource, said independent override means not able to be overridden by said local interface device.
These and further features of the invention will be made apparent from the description of an embodiment thereof given below by way of example. In the description reference is made to the accompanying drawings, but the specific features shown in the drawings should not be construed as limiting on the invention.
Brief description of the drawings
Figure 1 shows diagrammatically the manner in which an Internet access control system according to the invention may be used in an education environment.

Detailed description of the embodiments

The system described below is used in a teaching environment which may for example include all schools in a particular town or region. Of course, the system could be used in a number of other applications where some form of network access control is required. Access control might be required, for example, in a home environment, an office environment, or indeed any organisation or location where it is believed that free access to all resources on the Internet is not desirable.
The system described below, in essence, has two levels of control interposed between individual users and the Internet, as well as an automated gateway which those charged with controlling access by individual users can influence. In a teaching environment the first level of control would typically be at a teacher level, the teacher being the person in direct contact with those pupils operating individual computers who wish to locate information on the Internet. The second level of control would normally be provided by a senior administrator, such as a headmaster or senior staff member.
As shown in figure 1 of the drawing the system comprises the following main components. Firstly, the system comprises an adaptive interface device 10 which typically would be in the form of a proxy server, and might provide access to the Internet for a number of schools in a district or local area.
The adaptive interface device 10 would include monitoring software 12, a database 13, accessibility index software 14, and override software 16. The adaptive interface device 10 will provide the link between individual schools 18 and the Internet 20. In essence, a school would only be able to access individual resources on the Internet 20 if approved by the adaptive interface device 10. The accessibility index software 14 provides, in effect, a layered access control system as shown.
Typically each school 18 would comprise a series of classrooms 22, each classroom being monitored by a teacher 24, and each teacher having a number of pupils in his or her class as indicated at numeral 26. The pupils 26 each have an individual computer connected via connection lines 28 to the adaptive interface device 10. The teacher 24 may also have a computer connected to said adaptive interface device 10 through a teacher controlled access line 30. In the event that a resource is inaccessible it will be possible to immediately make the desired resource available by providing appropriate authorisation (following physical review and categorisation of said resource) to the adaptive interface device using means such as a password XX known by said teacher 24. On said student computers said password XX may be transient and allow a single authorisation cycle. On said teacher computer said password may allow multiple authorisations for a specified period without the need to re-enter said password. Some resources may be deemed suitable for student review (based on strict word parse heuristics) and allow authorisation by users without said password.
Each school 18 would also have a senior administrator 32 linked to the adaptive interface device 10 via a link 34, the senior administrator having a role of either providing access to particular sites, irrespective of any teacher override, or denying access to any particular site, irrespective of any teacher override. The administrator also has the ability to deny/allow access to entire categories of resource. The teacher and administrator have password controlled layered access via software 16, as shown, to modify the access settings.
In practical terms it is envisaged that the system would operate as follows.
The adaptive interface device 10 would typically comprise a remote proxy server. This ensures that no local physical hardware management is required. It would be a simple thirty second change for a school to point their Internet access via router 40 at this proxy server (and for that matter a thirty second change to remove the access control if the system proves unsatisfactory). Once the school is connected in this way all Internet access for that school passes through the proxy server. The filtration is driven out of a very large database which forms part of the proxy server. This database holds what could be called a "grey list" (grey because it is neither a true white list nor a true black list).
The grey list is so called because the individual users who access the Internet via the adaptive interface device 10 will give most of the sites accessed either a positive or negative ranking. At its most basic if a particular site has more positive rankings than negative rankings it will be accessible. If the site has more negative rankings it will not be accessible, unless a global allow/deny facility in the interface device 10 is overridden.
This positive or negative ranking for particular sites is in effect an "accessibility index". Of course, different types of indexes could be implemented, but ultimately what is required is that those who are tasked with the role of supervising access to particular resources either approve or disapprove of access being given, and that approval or disapproval decision translates into the accessibility index. Examples of how this might operate in practice are set out below.
The grey list may contain a table that looks like:
[Table image imgf000011_0001, not reproduced: grey list entries with their 'votes for' and 'votes against' counts.]
Of course, the table will have millions of resources classified in this way. Initially the grey list is seeded as follows. All the black list sites (taken from the widely available black lists) are inserted with 'votes for': 0 and 'votes against': 1. Likewise all the white list sites (similarly sourced, plus web-robot analysis of appropriate sites and portals) are manually reviewed and, if deemed appropriate, inserted with 'votes for': 1 and 'votes against': 0.
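The seeding rule just described can be sketched as follows (the dictionary shape is an assumption for illustration):

```python
def seed_grey_list(black_list, reviewed_white_list):
    """Seed the grey list: black-listed sites start with one vote against;
    human-reviewed white-list sites start with one vote for."""
    grey = {}
    for url in black_list:
        grey[url] = {"for": 0, "against": 1}
    for url in reviewed_white_list:
        grey[url] = {"for": 1, "against": 0}
    return grey

grey = seed_grey_list(["badsite.example"], ["goodsite.example"])
print(grey["badsite.example"])   # {'for': 0, 'against': 1}
print(grey["goodsite.example"])  # {'for': 1, 'against': 0}
```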
In addition to this a school will have its own custom list which might have something like the following format:
[Table image imgf000012_0001, not reproduced: the format of a school's custom list.]
A given school entity directs all its Internet access through the remote proxy server. Each and every request to access a resource is analysed by the proxy server. First the school's custom list is consulted and access is allowed or denied if there is a record indicating what the school specifically wants for a given resource. If there is no school record for the site then the global grey list is accessed. If a site is on the list and the votes for are greater than the votes against, access is allowed. If the votes against are greater than the votes for, access is denied. If the site is not on any list, access is also denied by default.

If access is denied via the custom list and the local "no over-ride" option is set then that site is not accessible. Typically only the headmaster or a small trusted group in a school would have the ability to set the "no over-ride" option. This allows a school to put in place whatever policy it likes, one that cannot be changed (by teachers or students) as is described below. Nor will such preferences be altered by changes to the global grey list.

Assuming access has been denied (but override is allowed) access is provided as follows. Rather than the usual "Access Denied, Please contact your Administrator" web page with which those who have used white list or black list systems will be familiar, the system presents a dynamic page at the computer of the user who requested the resource. This page carries the access denied message and the reason (local deny, category deny, global deny, unreviewed, etc.) and offers the opportunity to review the site (and potentially make it accessible) there and then. Optionally, if the site appears on automated analysis using a very strict word parse heuristic to contain totally innocuous material, the system can allow students to review it. This should work much better than a usual word parse heuristic because the system can be set to be overly strict about what is allowed.
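The per-request decision sequence above can be sketched as follows; the record shapes are my assumptions for illustration:

```python
def decide_access(url, custom_list, grey_list):
    """Return 'allow', 'deny' or 'deny-no-override' for one request."""
    # 1. The school's custom list takes precedence.
    local = custom_list.get(url)
    if local is not None:
        if local["allow"]:
            return "allow"
        # The "no over-ride" flag can only be set by a small trusted group.
        return "deny-no-override" if local.get("no_override") else "deny"
    # 2. Otherwise fall back to the global grey list.
    entry = grey_list.get(url)
    if entry is not None and entry["for"] > entry["against"]:
        return "allow"
    # 3. Unlisted sites and tied votes are denied by default.
    return "deny"

grey = {"site.example": {"for": 2, "against": 1}}
custom = {"banned.example": {"allow": False, "no_override": True}}
print(decide_access("site.example", custom, grey))     # allow
print(decide_access("banned.example", custom, grey))   # deny-no-override
print(decide_access("unknown.example", custom, grey))  # deny
```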
If the content does not pass the 'very innocuous' test then a username and password will be required to proceed (all teachers will have these). This is the validation stage. Note that the system will have the option to remember the username and password for the duration of a session. This is important to ensure the system is user friendly. The option to remember a password means that a teacher at their own computer will be asked for their password only once, but at a student machine they can authorise a site without then allowing the student to act with the teacher's permissions. Once a teacher has been validated they will skip the password requirement for the duration of their current session. The authority ends when they close the browser, log off, or after 1 hour - whichever is the sooner. This helps to minimise the inconvenience to end users.

Once validation has occurred the resource is presented with a top bar containing two buttons (Allow and Deny), a drop down classification list to select from, and a short text field for comments. Below this the actual resource is displayed. If the validated user selects "Allow" then one yes vote is added to the 'votes for' count linked with that resource on the global grey list. Likewise if the user selects "Deny" then one no vote is added to the 'votes against' count. The custom school list will instantly reflect this decision; thus if it is decided that a site should be available, it will be immediately available. This is a key issue as it also helps to make the filtration as non-invasive as possible. The less intrusive and inconvenient a system is, the more likely it is to gain wide acceptance. To be accessible a resource needs a positive accessibility index (votes for > votes against). By default the system will handle the situation where the votes for and against on the grey list are tied by denying access. Consider a resource that was seeded into the database from a blacklist.
This resource was initially assigned a 'votes against' count of 1 and a 'votes for' count of 0. As a result this resource will initially be inaccessible. If a teacher at school A decided to allow this site then it would have 1 vote for and 1 against. The school's custom list will ensure that this site is accessible to school A, so they will not need to vote again. However, other schools can still influence the accessibility index for the site.
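Recording a validated Allow/Deny review might be sketched as follows (the data shapes are again assumptions); note how the reviewing school's custom list mirrors the decision immediately:

```python
def record_review(url, allow, category, grey_list, custom_list):
    """Add one vote to the global grey list and mirror the decision in
    the reviewing school's custom list so it takes effect at once."""
    entry = grey_list.setdefault(
        url, {"for": 0, "against": 0, "categories": {}})
    entry["for" if allow else "against"] += 1
    # Every vote is accompanied by a category selection.
    entry["categories"][category] = entry["categories"].get(category, 0) + 1
    custom_list[url] = {"allow": allow}  # instant local effect

grey, custom = {}, {}
record_review("site.example", True, "education", grey, custom)
print(grey["site.example"]["for"])  # 1
print(custom["site.example"])       # {'allow': True}
```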
Let us consider the site dubious.net with 101 votes for and 101 against. How did it get that way? Let us presume it was on a black list and thus entered the system with a seed value of 1 vote against, as above. School X decides it is OK, so it gets added to their local access list and school X is now able to access the site freely. It now has 1 vote for and 1 against. It is still inaccessible to other schools. School Y reviews it and finds it offensive (2 against, 1 for). Still inaccessible. School Z reviews it and finds it acceptable (2 all). School A finds it acceptable (3 for, 2 against). At this point all schools have access. School B sees students at dubious.net and decides they don't want that. They use their administration interface (all teachers with a username and password can access this) to put a local ban on the site. Now the votes are again tied at 3 all, so the site drops out of global availability.
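The dubious.net walk-through above can be replayed as a short simulation (illustrative only):

```python
votes = {"for": 0, "against": 1}  # seeded from a black list

def review(allow):
    """Register one school's vote; return whether the site is now
    globally accessible (votes for must strictly exceed votes against)."""
    votes["for" if allow else "against"] += 1
    return votes["for"] > votes["against"]

print(review(True))   # School X allows: 1 for, 1 against -> False (tie)
print(review(False))  # School Y denies: 1 for, 2 against -> False
print(review(True))   # School Z allows: 2 for, 2 against -> False (tie)
print(review(True))   # School A allows: 3 for, 2 against -> True
print(review(False))  # School B bans:   3 for, 3 against -> False again
```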
Thus, sites will generally have an accessibility index development pattern which has the following features. If a site is allowed, and the consensus is that it is acceptable, there is no need to vote so the site will have a low vote count and tend to remain globally available, probably with a score like 2 for 1 against as there is no impetus to review allowed sites. Offensive sites are likely to accumulate votes against them as people click links, get denied access, review it, and then deny it (yet again). The more controversial sites will accumulate high vote counts, with narrow separation between the for and against votes.
Note that as a requirement of submitting an Allow or Deny vote users will be required to select a category for that site from a list of about 50 such categories. The voting information, and the category information, can be very valuable, and can be used as follows.
The vote counts (for/against) represent the global opinion of the peer user group. It allows a school to set policies based on peer opinion. A strict school might prefer a strict policy, a more liberal one could set it in a more liberal fashion. The vote counts and separation can be used as the determinants of how this is implemented. A well accepted site might have a vote count like 2 for 1 against. A globally agreed offensive site will have an ever increasing against vote count like 2 for 98 against as it is denied, reviewed, and voted against yet again. A moderately controversial site might have a vote count 11 for/9 against, and a highly controversial one 101 for 101 against.
Mathematically the vote counts can be processed to provide useful results as follows:

    controversy = 100 * ((for + against) / max) * (1 / (1 + abs(for - against)))**2
    weight      = 100 * ((for - against) / max) * (1 / controversy)

where max is the current highest vote count on the global list (this is used to normalize the data as the overall number of votes increases).
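Transcribed directly into Python (with `max_votes` being the current highest vote count on the global list, 200 in the worked examples that follow):

```python
def controversy(for_v, against_v, max_votes):
    # High when the vote count is large and the for/against split is narrow.
    return (100 * ((for_v + against_v) / max_votes)
            * (1 / (1 + abs(for_v - against_v))) ** 2)

def weight(for_v, against_v, max_votes):
    # Signed strength of consensus, damped by controversy.
    return (100 * ((for_v - against_v) / max_votes)
            * (1 / controversy(for_v, against_v, max_votes)))

print(round(controversy(1, 0, 200), 3))   # 0.125
print(round(weight(1, 0, 200), 5))        # 4.0
print(round(controversy(9, 10, 200), 3))  # 2.375
print(round(weight(9, 10, 200), 5))       # -0.21053
```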
The manner in which these two formulae can be used to provide useful information can best be explained using examples containing a range of possible numbers, as follows (with max = 200):

for/against: 1/0
controversy = 100 * ((1+0) / 200) * (1 / (1 + abs(1-0)))**2 = 0.125%
weighted +/- = 100 * ((1-0) / 200) * (1 / 0.125) = 4.00000

for/against: 0/1
controversy = 100 * ((0+1) / 200) * (1 / (1 + abs(0-1)))**2 = 0.125%
weighted +/- = 100 * ((0-1) / 200) * (1 / 0.125) = -4.00000

for/against: 2/1
controversy = 100 * ((2+1) / 200) * (1 / (1 + abs(2-1)))**2 = 0.375%
weighted +/- = 100 * ((2-1) / 200) * (1 / 0.375) = 1.33333

for/against: 1/2
controversy = 100 * ((1+2) / 200) * (1 / (1 + abs(1-2)))**2 = 0.375%
weighted +/- = 100 * ((1-2) / 200) * (1 / 0.375) = -1.33333

for/against: 9/10
controversy = 100 * ((9+10) / 200) * (1 / (1 + abs(9-10)))**2 = 2.375%
weighted +/- = 100 * ((9-10) / 200) * (1 / 2.375) = -0.21053

for/against: 10/9
controversy = 100 * ((10+9) / 200) * (1 / (1 + abs(10-9)))**2 = 2.375%
weighted +/- = 100 * ((10-9) / 200) * (1 / 2.375) = 0.21053

for/against: 49/50
controversy = 100 * ((49+50) / 200) * (1 / (1 + abs(49-50)))**2 = 12.375%
weighted +/- = 100 * ((49-50) / 200) * (1 / 12.375) = -0.04040

for/against: 50/49
controversy = 100 * ((50+49) / 200) * (1 / (1 + abs(50-49)))**2 = 12.375%
weighted +/- = 100 * ((50-49) / 200) * (1 / 12.375) = 0.04040

for/against: 99/100
controversy = 100 * ((99+100) / 200) * (1 / (1 + abs(99-100)))**2 = 24.875%
weighted +/- = 100 * ((99-100) / 200) * (1 / 24.875) = -0.02010

for/against: 100/99
controversy = 100 * ((100+99) / 200) * (1 / (1 + abs(100-99)))**2 = 24.875%
weighted +/- = 100 * ((100-99) / 200) * (1 / 24.875) = 0.02010

for/against: 45/50
controversy = 100 * ((45+50) / 200) * (1 / (1 + abs(45-50)))**2 = 1.319%
weighted +/- = 100 * ((45-50) / 200) * (1 / 1.319) = -1.89474

for/against: 2/198
controversy = 100 * ((2+198) / 200) * (1 / (1 + abs(2-198)))**2 = 0.00258%
weighted +/- = 100 * ((2-198) / 200) * (1 / controversy) = -38033 (using the unrounded controversy value)
Note that these formulae, while indicative of information that can be developed from the raw data, may not accurately reflect what works best in practice.

NOT FURNISHED UPON FILING
global category classification will be government as this is the consensus view. In the event of a tied vote on the category the first nominated category will be used.
An alternative approach will be for each resource to be entitled to have a plurality of subject matter categories assignable to it, which will make the system more dynamic, and allow some particularly wide scoped resources to be accessible in a number of different subject matter categories. Thus for example, a university medical school website would be appropriately classified in both medicine and education, and would be accessible to users who have access to either or both subject matter categories.
With this information in hand it is possible for a given user, supervisor or administrator to select allowed categories of Internet resources in a similar way to the way one might choose a cable TV channel package. This allows the global list to be utilized by clients in a very tailored manner. For example a legal firm might wish to start their service with obviously relevant material in categories such as: search engines, education, reference, business, law, commerce, legislation, news/media, government, property, etc. By allowing only specific categories the administrator has effectively selected the available resources appropriate to a particular business model. If resources from other categories are wanted they are simply approved in the usual (per resource) manner. Effectively the global database subject matter category information forms a second level of selectivity (as well as the votes information) to tailor Internet access. Using the library analogy, you can use subject matter categories to select the shelves, articles, magazines or books with which to stock the virtual Internet library available through the system. The categorisation is dynamic and can respond to changing content and user opinion as required.

In practical terms it is likely that there will be three independent databases serving primary, secondary and tertiary students, as the needs of these institutions are quite different. This is quite simple to implement. Clearly other institutions, such as corporate offices, might set completely different parameters and can be catered for with a separate database. The system has the potential to offer great benefit where there is a common group of peers with roughly similar preferences: although individually they contribute only a little to the global database, together they can review far more resources than they could individually.
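The "channel package" idea can be sketched as follows. The category names and the any-match rule are illustrative assumptions; an administrator could equally require all of a resource's categories to be allowed:

```python
# An administrator's selected category package (example values).
ALLOWED = {"search engines", "reference", "law", "legislation",
           "news/media", "government"}

def category_allowed(resource_categories, allowed=ALLOWED):
    """A resource is accessible if any of its assigned subject matter
    categories falls within the administrator's allowed set."""
    return any(cat in allowed for cat in resource_categories)

print(category_allowed({"law", "education"}))  # True
print(category_allowed({"gambling"}))          # False
```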
The category information that is input with the votes is also valuable as it will allow entities to ban sites such as 'Abortion Advocacy' or 'Sex Education' on say religious or moral grounds.
Of necessity we have to seed the database with data. The white-listing will however be limited, and based on complete human review, so at worst only controversial sites - rather than outright bad sites - should be included. The top 10,000 sites in the world (ranked by popularity rather than useful content) will be included at the start so that many of the sites a given user base wishes to access will already be classified. The black list seeding will no doubt include some good sites (it is currently about 5 million records long and gathered from many sources of varying quality). A large proportion will be correctly black listed, probably greater than 99%. For those that are not, the correction is simple and immediately available to the first reviewer who disagrees. The seeding with blacklist sites is non-essential, as all sites are denied by default; it simply helps to add a little momentum to the system. Additionally, blacklist sites will never be offered up for student review, even if that option has been enabled. No matter how innocuous a site may appear on a word parse, only password authenticated users will be able to modify the permissions of already listed sites.
(a) Summary
Each individual user entity (one school, for argument's sake) is free to change what is accessible instantly and with relatively trivial effort (username/password once only if desired, then three button clicks). This makes the system as minimally intrusive as possible. If a teacher feels that access to a particular site is required for his or her pupils then the site can be made accessible immediately. The system is peer opinion driven. That is, access does not depend on any one person's view of what students should see. It is the collective opinion of the very people to whom the education of children is already entrusted - the teachers. A teacher can make a site available to their students as they see fit. That opinion is recorded. If the balance of opinion favours that site being available then it is made globally available. If the balance of opinion does not favour it then it is blocked.
It is relatively simple to tune the system to suit the differing requirements of different schools using the controversy and weightings described above. Age brackets (primary/secondary/tertiary) are catered for with dedicated servers. Additionally a given entity is free to allow or deny anything they see fit with minimal effort or inconvenience.
From the initial seeding it is likely a very large resource will grow. In a single school with say 1000 students and 50 teachers you might expect 4 votes per day per teacher and perhaps an additional 1 vote per student per day (for rigorously screened innocuous content). This is 1200 votes per day, so roughly 6,000 per week, 24,000 per month and around 240,000 per year. Each individual contributes very little, and as a direct result suffers minimal inconvenience. However the sum result of their efforts is a very significant human review of the Internet. It is not beyond the realms of possibility that this might even keep pace with the ever expanding plethora of resources.
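Working the arithmetic of that estimate exactly (the 200 teaching days per year is my assumption for the yearly figure):

```python
students, teachers = 1000, 50
votes_per_day = teachers * 4 + students * 1  # 4 per teacher, 1 per student
print(votes_per_day)                         # 1200
print(votes_per_day * 5)                     # 6000 in a five-day week
print(votes_per_day * 200)                   # 240000 in a ~200-day school year
```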
In a perfect world it would be quite possible to listen to a user's voting and category preferences and simply act on them. However, on occasion some users might try to misclassify sites. A small proportion of users will have alternative agendas, and it is important that an access control facility is able to overcome the problem of isolated users who purposely misclassify sites in order to further their own agendas. We can refer to such users as "bad apples". While it is safe to assume that most of the users can be trusted most of the time, the system can be enhanced to deal with bad apples. In broad terms, the system can assign a trust value to those who classify resources accessible via the interface device. The system would work substantially as follows.
By default, any given user is not trusted. What this means is that the system will not arbitrarily accept any particular user's input about (in particular) the category they suggest for a resource. Consider a bad apple who decides to place xxx.com into Sex::Education::General or some other allowed category. How is the system to deal with this? If this vote were simply registered and the site had never been reviewed before, it would become available to all the other user entities as well as the entity to which the bad apple user belonged. Clearly this is unacceptable. It is proposed that whenever a new (previously unreviewed) site attracts review attention the following process comes into play.
The content of the site is downloaded and analysed by analysis software. An algorithm that uses Bayesian probability analysis may be used to check for signature features that allow a fairly good automated guess about what that site is about. While such a system is not likely to analyse each and every resource with perfect accuracy, this analysis technique should be highly accurate at detecting pornographic and gambling related sites (to name a few), as these have very characteristic signature elements. Mathematically, a system of this nature runs at a sensitivity (few false negatives) and specificity (few false positives) level of greater than 99%. If the algorithm flags a site in a particular category, that categorisation is thus likely to be correct to an accuracy of around 99% for some types of sites, particularly those with fairly typical characteristics, such as pornography and gambling sites.
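A minimal sketch of this kind of Bayesian signature analysis is given below. It is not the patented implementation: the categories, training texts and function names are hypothetical placeholders, and a production classifier would train on far more data.

```python
import math
from collections import Counter

# Illustrative naive Bayes text classifier with Laplace smoothing.
# Categories and training documents are hypothetical examples only.

def train(docs):
    """docs: list of (category, text) pairs. Returns per-category word
    counts, per-category word totals, and category priors."""
    counts, totals, priors = {}, {}, Counter()
    for category, text in docs:
        priors[category] += 1
        words = text.lower().split()
        counts.setdefault(category, Counter()).update(words)
        totals[category] = totals.get(category, 0) + len(words)
    return counts, totals, priors

def classify(text, counts, totals, priors):
    """Pick the category with the highest naive Bayes log-score."""
    vocab = {w for c in counts.values() for w in c}
    total_docs = sum(priors.values())
    best, best_score = None, -math.inf
    for category in counts:
        score = math.log(priors[category] / total_docs)
        for w in text.lower().split():
            # Laplace smoothing so unseen words do not zero the score
            p = (counts[category][w] + 1) / (totals[category] + len(vocab))
            score += math.log(p)
        if score > best_score:
            best, best_score = category, score
    return best

docs = [
    ("porn", "adult xxx explicit adult"),
    ("gambling", "casino poker bet odds"),
    ("education", "lesson curriculum students teacher"),
]
counts, totals, priors = train(docs)
print(classify("xxx adult content", counts, totals, priors))  # porn
```

The design point is the one the text makes: sites in categories with highly characteristic vocabularies (pornography, gambling) are exactly the ones such signature analysis flags most reliably.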
With this information it is possible to build up a trust profile on each and every user who has the ability to influence the system. Before their categorisation input (opinion) will influence the global database, they will need to prove, via the resource analysis means, that they are categorising accurately. This means putting porn in porn, gambling in gambling, and so on. For example, in the case mentioned above, bad apple elects to place xxx.com into the Sex::Education::General category rather than the Sex::Hardcore Pornography category. If the Sex::Education::General category is allowed in bad apple's entity, then that site is instantly accessible to all the users within that entity. This is of course undesirable. However, every 24 hours a global database update occurs. At this time the system notes that
(1) bad apple is not yet a trusted user, so it must
(2) download xxx.com and run the algorithm.
As noted, the probability of the algorithm flagging xxx.com as porn is greater than 99%. The site is flagged as porn and it is noted that it has not been assigned to the Sex::Pornography category. As a result the system responds as follows: (1) The site is denied for bad apple's entity and the 'no override' bit is set, so bad apple cannot change this (unless he or she is one of the very limited group that has administrator privileges for that site). The site is now no longer available even to users at bad apple's entity. It has never been available globally, as it has never passed into the global database. In the worst case it was available to bad apple's entity for one day.
(2) Bad apple's trust rating is negatively influenced.
(3) Bad apple gets an email from the system noting the transgression and is invited either to modify the category or to lodge a 'your algorithm got it wrong' message. There is also a warning that their administrator has been notified. (4) The administrator for bad apple's entity may or may not actually be notified, depending on their settings and whether this is the first, second or third transgression by bad apple.
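The four responses above can be sketched as follows. This is a hypothetical illustration only: the class `User`, the function `handle_transgression` and the two-strike notification threshold are assumptions, not details from the specification.

```python
from dataclasses import dataclass, field

# Hypothetical sketch of the four-step response when the daily update
# finds an untrusted user's categorisation contradicted by the
# analysis algorithm. All names and thresholds here are illustrative.

@dataclass
class User:
    name: str
    trust_score: int = 0
    transgressions: int = 0
    messages: list = field(default_factory=list)

def handle_transgression(user, site, entity_denied, admin_notify_after=2):
    entity_denied.add(site)            # (1) deny the site; 'no override' set
    user.trust_score -= 1              # (2) trust rating falls
    user.transgressions += 1
    user.messages.append(              # (3) email the user
        f"{site}: recategorise, or lodge an 'algorithm got it wrong' message")
    # (4) administrator notified once transgressions reach the threshold
    return user.transgressions >= admin_notify_after

denied = set()
bad_apple = User("bad_apple")
first = handle_transgression(bad_apple, "xxx.com", denied)
second = handle_transgression(bad_apple, "xxx2.com", denied)
print(first, second)  # False True: the administrator is alerted on strike two
```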
The transgression count is important. If the probability of the algorithm recording a false positive is, say, 1%, then a single transgression has an approximate 1:100 statistical chance of being an error. With two transgressions, however, the odds of an error are about 1:10,000, and with three transgressions approximately a million to one. Note that if bad apple had flagged xxx.com as porn, none of the above would have occurred. In fact, they would have gained trust points. The number of bad apples will be small to start with. After they try to misclassify a site and see that the system is watching them, it will shrink further. If they continue to, say, allow porn sites in inappropriate categories, then peer pressure will be applied (in the form of their entity's administrator) and their input will simply be ignored outside of their local entity. In this way the influence of bad apples on the global database will be minimal. It is also possible to negate the influence of a previously trusted bad apple on the global database. This is because the system keeps a record of which specific user(s) added input to a given global database entry. If a bad apple (who was trusted) is detected, his or her opinions can be automatically removed from the global database.
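The compounding odds quoted above follow from treating each misflag as an independent ~1% error, as a two-line check shows:

```python
# Checking the transgression odds quoted above: with a ~99%-accurate
# analysis algorithm, independent misflags compound multiplicatively.
p_error = 0.01                 # ~1:100 chance a single flag is wrong
two = p_error ** 2             # ~1:10,000 for two transgressions
three = p_error ** 3           # ~1:1,000,000 for three
print(two, round(1 / three))
```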
It will be understood that the invention disclosed and defined herein extends to all alternative combinations of two or more of the individual features mentioned in or evident from the text or drawings. All of these different combinations constitute various alternative aspects of the invention. The foregoing describes embodiments of the present invention, and modifications obvious to those skilled in the art can be made thereto without departing from the scope of the present invention.

Claims
1 A system for regulating access by individual computer users to a network containing both desirable and undesirable content in which an automated gateway is located between the individual end users and resources available on a computer network, the gateway including an access protocol which automatically varies over time, depending on the response of users who attempt to access specific resources that are potentially available via that gateway.
2 An adaptive interface device for regulating the access by individual computer users to network resources containing both desirable and undesirable content, the adaptive interface device providing a gateway through which said individual computers will be connected to the network, said adaptive interface device comprising: storage means adapted to store data relating to individual resources accessible via said network, said stored data including at least a global accessibility index relating to said individual resources, said accessibility index adapted to have at least either a positive or negative value; access control means adapted to connect said individual computers to said network when said global accessibility index is positive or deny access when said global accessibility index is negative; and influence means accessible by at least selected authorised users of said individual computers for influencing said accessibility index either positively or negatively, depending on whether those users support or do not support access to said individual network resources.
3 An adaptive interface device according to claim 2 wherein the stored data includes subject matter categorisation which in use will be linked with individual resources accessible via said network, the device enabling individual users to assign a subject matter category to individual resources accessed via said interface device.
4 An adaptive interface device according to claim 2 or 3 wherein the access control means is adapted to connect individual users to only those resources which fall into one or more subject matter categories.
5 An adaptive interface device according to claim 2 or 3 wherein the access control means is adapted to deny individual users access to resources which fall into one or more subject matter categories.
6 An adaptive interface device according to any one of claims 2 to 5 wherein the device is adapted to keep a running total of the different subject matter categories assigned by different users to the same resource, and for the subject matter category which is associated with a particular resource to be that which, at any point in time, has received the highest number of assignations by individual users reviewing that resource.
7 An adaptive interface device according to any one of claims 2 to 6 wherein each resource has a plurality of subject matter categories assigned to it if users determine that the resource covers more than one category.
8 An adaptive interface device according to any one of claims 2 to 7 including resource analysis means adapted to analyse the categorisation of individual resources and reject that categorisation where the categorisation by one or more users does not pass a predetermined threshold test.
9 An adaptive interface device according to claim 8 wherein the resource analysis means includes a probability analysis algorithm to determine whether a resource has been correctly classified.
10 An adaptive interface device according to claim 9 wherein the resource analysis means is adapted to either positively or negatively list users who respectively correctly or incorrectly categorise resources.
11 An adaptive interface device according to claim 10 wherein, should a user incorrectly categorise resources on one or more occasions, the resource analysis means may be adapted to ignore further categorisations from that user, or notify an administrator of the system of that user's deliberate mis-categorisation.
12 An adaptive interface device according to claim 11 wherein the resource analysis means negates previous categorisations by that user.
13 An adaptive interface device according to any one of claims 9 to 12 wherein the probability analysis algorithm is a Bayesian or Fisher/Robinson probability analysis algorithm.
14 An adaptive interface device according to any one of claims 2 to 13 which includes override means which selected authorised users may access to override said access control means to thereby allow access or deny access of selected resources to individual users or groups of users over whom they are charged with responsibility.
15 An adaptive interface device according to claim 14 wherein the storage means is adapted to keep a running total of the number of authorised users who choose to influence the accessibility index either positively or negatively, and to provide a positive accessibility index when certain criteria are met.
16 An adaptive interface device according to claim 15 wherein access is allowed if the number of authorised users who influence the index positively exceeds the number of authorised users who influence the index negatively.
17 An adaptive interface device according to claim 15 or 16 wherein access is denied if the number of authorised users who influence the index negatively exceeds the number of authorised users who influence the index positively.
18 A system incorporating an adaptive interface device according to any preceding claim, and a plurality of individual computers, each of said individual computers being linked to said adaptive interface device via a local interface device, said local interface device being adapted to allow or deny access to any individual resource, irrespective of the global accessibility index for that resource.
19 A system according to claim 18 wherein said local interface device is located in the software of said individual computers or within the software of said adaptive interface device, and is operable using a password or similar control device.
20 A system according to claim 18 or 19 wherein each individual computer includes a password controlled override facility which is adapted to access the override means in order to allow or deny access to individual resources, irrespective of the state of the global accessibility index for that resource.
21 A system according to any one of claims 18 to 20 which includes independent override means, which is operable independently of said individual computers to deny access to particular resources, or categories of resources, by at least some of said individual computers, irrespective of the accessibility index for that resource, said independent override means not being able to be overridden by said local interface device.
22 An adaptive interface device substantially as hereinbefore described with reference to the accompanying drawing.
PCT/AU2004/000791 2003-06-19 2004-06-17 Method and system for barring access to selected internet resources WO2004111870A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
AU2003903104 2003-06-19
AU2003903104A AU2003903104A0 (en) 2003-06-19 2003-06-19 Method and system for barring access to selected internet resources

Publications (1)

Publication Number Publication Date
WO2004111870A1 true WO2004111870A1 (en) 2004-12-23

Family

ID=31954143

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/AU2004/000791 WO2004111870A1 (en) 2003-06-19 2004-06-17 Method and system for barring access to selected internet resources

Country Status (2)

Country Link
AU (1) AU2003903104A0 (en)
WO (1) WO2004111870A1 (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110289216A1 (en) * 2010-05-21 2011-11-24 Timothy Szeto System and Method for Generating Subnets and Using Such Subnets for Controlling Access to Web Content
WO2012117155A1 (en) * 2011-02-28 2012-09-07 Nokia Corporation Method and apparatus for providing a proxy-based access list
WO2016203474A1 (en) * 2015-06-18 2016-12-22 Googale (2009) Ltd Secured computerized system for children and/or pre- literate/ illiterate users
US9871798B2 (en) 2015-06-18 2018-01-16 Googale (2009) Ltd. Computerized system facilitating secured electronic communication between and with children
US10198963B2 (en) 2015-06-18 2019-02-05 Googale (2009) Ltd. Secure computerized system, method and computer program product for children and/or pre-literate/illiterate users
US10853029B2 (en) 2015-06-18 2020-12-01 Googale (2009) Ltd. Computerized system including rules for a rendering system accessible to non-literate users via a touch screen

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5706507A (en) * 1995-07-05 1998-01-06 International Business Machines Corporation System and method for controlling access to data located on a content server
GB2365172A (en) * 2000-01-06 2002-02-13 Ibm Method system and program for filtering content using neural networks
US6564327B1 (en) * 1998-12-23 2003-05-13 Worldcom, Inc. Method of and system for controlling internet access



Also Published As

Publication number Publication date
AU2003903104A0 (en) 2003-07-03


Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BW BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE EG ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NA NI NO NZ OM PG PH PL PT RO RU SC SD SE SG SK SL SY TJ TM TN TR TT TZ UA UG US UZ VC VN YU ZA ZM ZW

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): GM KE LS MW MZ NA SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IT LU MC NL PL PT RO SE SI SK TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
122 Ep: pct application non-entry in european phase