[PATCH 3/3] net: hisilicon: new hip04 ethernet driver
Arnd Bergmann
arnd at arndb.de
Thu Mar 20 10:31:20 EDT 2014
On Thursday 20 March 2014, Zhangfei Gao wrote:
> On Tue, Mar 18, 2014 at 7:25 PM, Arnd Bergmann <arnd at arndb.de> wrote:
> > On Tuesday 18 March 2014 16:40:17 Zhangfei Gao wrote:
> >
> >> +
> >> +static void __iomem *ppebase;
> >
> > The global 'ppebase' seems hacky. Isn't that a SoC-specific register area, while
> > the rest of the driver is reusable across SoCs?
> >
> > What does 'ppe' stand for?
> >
> > What if there are multiple instances of this, which each have their own ppebase?
>
> In this specific platform,
> ppe is the only module controlling all the fifos for all the net controller.
> And each controller connect to specific port.
> ppe has 2048 channels, sharing by all the controller, only if not overlapped.
>
> So the static ppebase is required, which I don't like too.
> Two inputs required, one is port, which is connect to the controller.
> The other is start channel, currently I use id, and start channel is
> RX_DESC_NUM * priv->id; /* start_addr */
Ok, thanks for the explanation!
> > This also seems to encode knowledge about a particular implementation
> > into the driver. Maybe it's better to add a property for the port
> > mode?
>
> After check Documentation/devicetree/bindings/net/ethernet.txt,
> I think phy-mode is more suitable here.
>
> switch (priv->phy_mode) {
> case PHY_INTERFACE_MODE_SGMII:
> if (speed == SPEED_1000)
> val = 8;
> else
> val = 7;
> break;
> case PHY_INTERFACE_MODE_MII:
> val = 1; /* SPEED_100 */
> break;
> default:
> val = 0;
> break;
> }
> writel_relaxed(val, priv->base + GE_PORT_MODE);
>
> probe:
> priv->phy_mode = of_get_phy_mode(node);
Yes, this looks better, but where does 'speed' come from now? I assume
even in SGMII mode, you should allow autonegotiation and set this correctly
from the PHY code. Is that what you do here?
> >> +static void hip04_set_xmit_desc(struct hip04_priv *priv, dma_addr_t phys)
> >> +{
> >> + writel_relaxed(phys, priv->base + PPE_CFG_TX_PKT_BD_ADDR);
> >> +}
> >> +
> >> +static void hip04_set_recv_desc(struct hip04_priv *priv, dma_addr_t phys)
> >> +{
> >> + writel_relaxed(phys, ppebase + priv->port * 4 + PPE_CFG_RX_CFF_ADDR);
> >> +}
> >> +
> >> +static u32 hip04_recv_cnt(struct hip04_priv *priv)
> >> +{
> >> + return readl_relaxed(priv->base + PPE_HIS_RX_PKT_CNT);
> >> +}
> >
> > At the very least, the hip04_set_xmit_desc() function needs to use 'writel'
> > rather than 'writel_relaxed'. Otherwise data that is being sent out
> > can be stuck in the CPU's write buffers and you send stale data on the wire.
> >
> > For the receive path, you may or may not need to use 'readl', depending
> > on how DMA is serialized by this device. If you have MSI interrupts, the
> > interrupt message should already do the serialization, but if you have
> > edge or level triggered interrupts, you normally need to have one readl()
> > from the device register between the IRQ and the data access.
>
> Really, will update to readl/wirtel in xmit and hip04_rx_poll.
> I just got impression, use *_relaxed as much as possible for better performance.
Ok. The _relaxed() versions are really meant for people that understand
the ordering requirements. The regular readl/writel accessors contain
extra barriers to make them equivalent to what x86 does.
> >> +static int hip04_rx_poll(struct napi_struct *napi, int budget)
> >> +{
> >> + struct hip04_priv *priv = container_of(napi,
> >> + struct hip04_priv, napi);
> >> + struct net_device *ndev = priv->ndev;
> >> + struct sk_buff *skb;
> >> + struct rx_desc *desc;
> >> + unsigned char *buf;
> >> + int rx = 0;
> >> + unsigned int cnt = hip04_recv_cnt(priv);
> >> + unsigned int len, tmp[16];
> >> +
> >> + while (cnt) {
> >> + buf = priv->rx_buf[priv->rx_head];
> >> + skb = build_skb(buf, priv->rx_buf_size);
> >> + if (unlikely(!skb))
> >> + net_dbg_ratelimited("build_skb failed\n");
> >> + dma_map_single(&ndev->dev, skb->data,
> >> + RX_BUF_SIZE, DMA_FROM_DEVICE);
> >> + memcpy(tmp, skb->data, 64);
> >> + endian_change((void *)tmp, 64);
> >> + desc = (struct rx_desc *)tmp;
> >> + len = desc->pkt_len;
> >
> > The dma_map_single() seems misplaced here, for all I can tell, the
> > data has already been transferred. Maybe you mean dma_unmap_single?
> >
> > I don't see why you copy 64 bytes out of the buffer using endian_change,
> > rather than just looking at the first word, which seems to have the
> > only value you are interested in.
> Russell suggested using be16_to_cpu etc to replace memcpy.
Right, but doesn't it have to be be32_to_cpu?
> >> +static int hip04_alloc_ring(struct net_device *ndev, struct device *d)
> >> +{
> >> + struct hip04_priv *priv = netdev_priv(ndev);
> >> + int i;
> >> +
> >> + priv->rx_buf_size = RX_BUF_SIZE +
> >> + SKB_DATA_ALIGN(sizeof(struct skb_shared_info));
> >> +
> >> + priv->desc_pool = dma_pool_create(DRV_NAME, d, sizeof(struct tx_desc),
> >> + SKB_DATA_ALIGN(sizeof(struct tx_desc)), 0);
> >> + if (!priv->desc_pool)
> >> + return -ENOMEM;
> >> +
> >> + for (i = 0; i < TX_DESC_NUM; i++) {
> >> + priv->td_ring[i] = dma_pool_alloc(priv->desc_pool,
> >> + GFP_ATOMIC, &priv->td_phys[i]);
> >> + if (!priv->td_ring[i])
> >> + return -ENOMEM;
> >> + }
> >
> > Why do you create a dma pool here, when you do all the allocations upfront?
> >
> > It looks to me like you could simply turn the td_ring array of pointers
> > to tx descriptors into a an array of tx descriptors (no pointers) and allocate
> > that one using dma_alloc_coherent.
>
> dma pool used here mainly because of alignment,
> the desc has requirement of SKB_DATA_ALIGN,
> so use the simplest way
> priv->desc_pool = dma_pool_create(DRV_NAME, d, sizeof(struct tx_desc),
> SKB_DATA_ALIGN(sizeof(struct tx_desc)), 0);
dma_alloc_coherent() will actually give you PAGE_SIZE alignment, so that's
still easier.
> >
> >
> >> + if (!ppebase) {
> >> + struct device_node *n;
> >> +
> >> + n = of_find_compatible_node(NULL, NULL, "hisilicon,hip04-ppebase");
> >> + if (!n) {
> >> + ret = -EINVAL;
> >> + netdev_err(ndev, "not find hisilicon,ppebase\n");
> >> + goto init_fail;
> >> + }
> >> + ppebase = of_iomap(n, 0);
> >> + }
> >
> > How about using syscon_regmap_lookup_by_phandle() here? That way, you can have
> > a more generic abstraction of the ppe, and stick the port and id in there as
> > well, e.g.
> >
> > ppe-syscon = <&hip04ppe 1 4>; // ppe, port, id
>
> To be honest, I am not familiar with syscon_regmap_lookup_by_phandle.
>
> Florian has suggested
> ppe: ppe at 28c0000 {
> compatible = "hisilicon,hip04-ppe";
> reg = <0x28c0000 0x10000>;
> #address-cells = <1>;
> #size-cells = <0>;
>
> eth0_port: port at 1f {
> reg = <31>;
minor comment: I'd use 0x1f for the reg too.
> };
>
> eth1_port: port at 0 {
> reg = <0>;
> };
>
> eth2_port: port at 8 {
> reg = <8>;
> };
> };
> fe: ethernet at 28b0000 {
> compatible = "hisilicon,hip04-mac";
> reg = <0x28b0000 0x10000>;
> interrupts = <0 413 4>;
> phy-mode = "mii";
> port-handle = <ð0_port>;
> };
> And the port info can be found from port-handle
> n = of_parse_phandle(node, "port-handle", 0);
> ret = of_property_read_u32(n, "reg", &priv->port);
>
> The id is the controller start channel in ppe, either use alias or
> another property in the port.
Yes, this seems fine as well, as long as you are sure that the ppe
device is only used by hip04 ethernet devices, I think it
would get messy if your hardware shares them with other units.
It's probably a little simpler to avoid the sub-nodes and instead do
> ppe: ppe at 28c0000 {
> compatible = "hisilicon,hip04-ppe";
> reg = <0x28c0000 0x10000>;
> #address-cells = <1>;
> #size-cells = <0>;
> };
> fe: ethernet at 28b0000 {
> compatible = "hisilicon,hip04-mac";
> reg = <0x28b0000 0x10000>;
> interrupts = <0 413 4>;
> phy-mode = "mii";
> port-handle = <&ppe 31>;
> };
In the code, I would create a simple ppe driver that exports
a few functions. you need. In the ethernet driver probe() function,
you should get a handle to the ppe using
/* look up "port-handle" property of the current device, find ppe and port */
struct hip04_ppe *ppe = hip04_ppe_get(dev->of_node);
if (IS_ERR(ppe))
return PTR_ERR(ptr); /* this handles -EPROBE_DEFER */
and then in other code you can just do
hip04_ppe_set_foo(priv->ppe, foo_config);
This is a somewhat more high-level abstraction that syscon, which
just gives you a 'struct regmap' structure for doing register-level
configuration like you have today.
> >> + ret = of_property_read_u32(node, "speed", &val);
> >> + if (ret) {
> >> + dev_warn(d, "not find speed info\n");
> >> + priv->speed = SPEED_1000;
> >> + }
> >> +
> >> + if (SPEED_100 == val)
> >> + priv->speed = SPEED_100;
> >> + else
> >> + priv->speed = SPEED_1000;
> >> + priv->duplex = DUPLEX_FULL;
> >
> > Why do you even need the speed here, shouldn't you get that information
> > from the phy through hip04_adjust_link?
>
> There still 100M mac does not have phy, as well adjust_link.
> Will use phy-mode instead of speed.
Ok.
Arnd
More information about the linux-arm-kernel
mailing list