subdomain.domain.tld regexp?


  1. #1

    subdomain.domain.tld regexp?

    Is there a module that can split a given host name into its subdomain,
    domain, and TLD parts?

    e.g.

    www.google.com => www google com
    www.google.co.uk => www google.co uk
    www.google.nl => www google nl
    www.asus.com.tw => www asus.com tw

    (yup, technically the com? is probably the domain, and www a sub-sub
    domain)

    --
    John Small Perl scripts: http://johnbokma.com/perl/
    Perl programmer available: http://castleamber.com/
    Happy Customers: http://castleamber.com/testimonials.html

    John Bokma Guest

  3. #2

    Re: subdomain.domain.tld regexp?



    John Bokma wrote:
    > Is there a module that can split a given host name into its subdomain,
    > domain, and TLD parts?
    >
    > e.g.
    >
    > www.google.com => www google com
    > www.google.co.uk => www google.co uk
    > www.google.nl => www google nl
    > www.asus.com.tw => www asus.com tw
    >
    > (yup, technically the com? is probably the domain, and www a sub-sub
    > domain)
    DNS is strictly hierarchical. As far as DNS is concerned, 'uk', 'co.uk',
    'google.co.uk' and 'www.google.co.uk' are all just domains.

    You need to take a step back and consider what you really want to do. I
    suspect you have no idea.

    To split a dot-delimited string into a list of components:

    my @components = split /\./, $domain_name;
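
    As a rough illustration of both points, the same split can be used to
    list every enclosing domain of a host name. The sub name
    enclosing_domains is made up for this sketch:

```perl
use strict;
use warnings;

# Hypothetical helper: return the name itself followed by each parent
# domain, from the most specific down to the top-level domain.
sub enclosing_domains {
    my ($domain_name) = @_;
    my @components = split /\./, $domain_name;
    return map { join '.', @components[ $_ .. $#components ] }
           0 .. $#components;
}

# Prints www.google.co.uk, google.co.uk, co.uk and uk, one per line.
print "$_\n" for enclosing_domains('www.google.co.uk');
```

    Nothing in the string itself marks any one of those four names as
    special.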

    Do you perhaps want to find the DNS zone boundaries? If so, why do you
    think this would be helpful?


    Brian McCauley Guest

  4. #3

    Re: subdomain.domain.tld regexp?

    Brian McCauley wrote:
    > John Bokma wrote:
    >
    >> Is there a module that can split a given host name into its subdomain,
    >> domain, and TLD parts?
    >>
    >> e.g.
    >>
    >> www.google.com => www google com
    >> www.google.co.uk => www google.co uk
    >> www.google.nl => www google nl
    >> www.asus.com.tw => www asus.com tw
    >>
    >> (yup, technically the com? is probably the domain, and www a sub-sub
    >> domain)
    >
    > DNS is strictly hierarchical. As far as DNS is concerned, 'uk',
    > 'co.uk', 'google.co.uk' and 'www.google.co.uk' are all just domains.
    Yeah I know.
    > You need to take a step back and consider what you really want to do.
    > I suspect you have no idea.
    Because you don't have an idea doesn't mean I haven't one.

    I want to get the part one can register as, say, a company.

    In the uk I can't register

    somename.uk

    but I can register:

    somename.co.uk

    In nl, however, I can register:

    somename.nl

    --
    John Small Perl scripts: http://johnbokma.com/perl/
    Perl programmer available: http://castleamber.com/
    Happy Customers: http://castleamber.com/testimonials.html

    John Bokma Guest

  5. #4

    Re: subdomain.domain.tld regexp?



    John Bokma wrote:
    > Brian McCauley wrote:
    >
    >
    >>John Bokma wrote:
    >>
    >>
    >>>Is there a module that can split a given host name into its subdomain,
    >>>domain, and TLD parts?
    >>>
    >>>e.g.
    >>>
    >>>www.google.com => www google com
    >>>www.google.co.uk => www google.co uk
    >>>www.google.nl => www google nl
    >>>www.asus.com.tw => www asus.com tw
    >>>
    >>>(yup, technically the com? is probably the domain, and www a sub-sub
    >>>domain)
    >>
    >>DNS is strictly hierarchical. As far as DNS is concerned, 'uk',
    >>'co.uk', 'google.co.uk' and 'www.google.co.uk' are all just domains.
    >
    >
    > Yeah I know.
    >
    >
    >>You need to take a step back and consider what you really want to do.
    >>I suspect you have no idea.
    >
    > Because you don't have an idea doesn't mean I haven't one.
    No, but the fact that your OP does not convey your idea is often a good
    sign that you haven't actually crystallised your idea.
    > I want to get the part one can register as, say, a company.
    This is a question about the policy of issuing subdomains under a
    domain. There may or may not be a standard way for a domain to publish
    its subdomain policy. There certainly is no widely used standard.
    If there were a widely used standard then there could be a Perl module
    to interface to it.
    > In the uk I can't register
    >
    > somename.uk
    You can't, but some entities can. Sufficiently few uk second-level
    domains are issued that I suspect each application is considered on its
    merits rather than by a simple algorithm.

    There are also domains like ac.uk which accept registrations from a
    restricted class of entities.

    Someone could compose a list of domains that accept subdomain
    registration from the general populace but this could simply be a plain
    text file rather than a Perl module.

    Note: there are situations like uk.com where both uk.com and com accept
    registrations.
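
    A minimal sketch of that list-based approach follows. The suffix list is
    a handful of hand-picked entries for illustration, not an authoritative
    registry policy, and the names %registry_suffix and split_host are made
    up here. Checking the longest candidate suffix first is what handles the
    uk.com case:

```perl
use strict;
use warnings;

# Illustrative entries only (an assumption, not real registry data).
# In practice these would be read from a maintained text file,
# one suffix per line.
my %registry_suffix = map { $_ => 1 } qw(com nl co.uk com.tw uk.com);

# Hypothetical helper: split a host name into (subdomain, registrable domain).
# Candidate suffixes are tried longest first, so a.b.uk.com is split at
# uk.com rather than at com.
sub split_host {
    my ($host) = @_;
    my @labels = split /\./, lc $host;
    for my $i (1 .. $#labels) {
        my $suffix = join '.', @labels[ $i .. $#labels ];
        next unless $registry_suffix{$suffix};
        my $registrable = join '.', @labels[ $i - 1 .. $#labels ];
        my $subdomain   = join '.', @labels[ 0 .. $i - 2 ];
        return ($subdomain, $registrable);
    }
    return;    # no listed suffix matched: empty list
}

for my $host (qw(www.google.com www.google.co.uk www.google.nl
                 www.asus.com.tw a.b.uk.com)) {
    my ($sub, $domain) = split_host($host);
    printf "%-18s => %-4s %s\n", $host, $sub, $domain;
}
```

    A publicly maintained list of this kind exists as the Public Suffix
    List; the sketch does not read that file format.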

    Brian McCauley Guest

  6. #5

    Re: subdomain.domain.tld regexp?

    @Brian McCauley: How about reading between the lines a little and providing a /helpful/ answer instead of rhetoric? It's not that hard to see that what he wants is just the domain part of a domain name and not the host part.

    @John Bokma: While there is no Perl module that I am aware of that will give you what you seek, you can certainly get most of the way there with regular expressions:
    Code:
    use strict;
    use warnings;

    # Heuristic only: any xx.yyy ending is treated as a two-part suffix,
    # so a name like www.hp.com comes back whole.
    foreach (qw(a.b.co.uk a.b.uk.com a.b.com b.co.uk b.uk.com b.com z.y.x.a.b.c.co.uk z.y.x.a.b.c.uk.com) ) {
      printf("%-19s: %s\n", $_, ($_ =~ m/
        ^
        (?:[a-z0-9][a-z0-9-]*\.)*?     # lazily skip leading subdomain labels
        (                              # capture the remainder as $1
          [a-z0-9][a-z0-9-]*\.         # the registered label, e.g. "b."
          (?:
            [a-z]{2}\.[a-z]{2,3}       # two-part suffix such as co.uk or uk.com
            |
            [a-z]{2,3}                 # one-part suffix such as com or nl
          )
        )                              # anchored at both ends, so no mid-label matches
        $
      /x ? $1 : "no match"));
    }
    Cheers!,

    --
    Carl
    Carl Corliss Guest
